AN INVERSE THEOREM FOR THE GOWERS U^3(G) NORM


BEN GREEN AND TERENCE TAO

Abstract. There has been much recent progress in the study of arithmetic progressions
in various sets, such as dense subsets of the integers or of the primes. One key
tool in these developments has been the sequence of Gowers uniformity norms U^d(G),
d = 1, 2, 3, ... on a finite additive group G; in particular, to detect arithmetic
progressions of length k in G it is important to know under what circumstances the
U^{k-1}(G) norm can be large.

The U^1(G) norm is trivial, and the U^2(G) norm can be easily described in terms of
the Fourier transform. In this paper we systematically study the U^3(G) norm, defined
for any function f : G → C on a finite additive group G by the formula

    \|f\|_{U^3(G)} := \Big( |G|^{-4} \sum_{x,a,b,c \in G} f(x)\,\overline{f(x+a)}\,\overline{f(x+b)}\,\overline{f(x+c)}\, f(x+a+b)
                      \times f(x+b+c)\, f(x+c+a)\, \overline{f(x+a+b+c)} \Big)^{1/8}.

We give an inverse theorem for the U^3(G) norm on an arbitrary group G. In the finite
field case G = F_5^n we show that a bounded function f : G → C has large U^3(G) norm
if and only if it has a large inner product with a function e(φ), where e(x) := e^{2πix}
and φ : F_5^n → R/Z is a quadratic phase function. In a general G the statement is more
complicated - the phase φ is quadratic only locally on a Bohr neighbourhood in G.

As an application we extend Gowers’ proof QSI of Szemerédi’s theorem for progressions
of length 4 to arbitrary abelian G. More precisely, writing r_4(G) for the size of the
largest A ⊆ G which does not contain a progression of length four, we prove that

    r_4(G) \ll |G| (\log\log|G|)^{-c}

where c is an absolute constant.

We also discuss links between our ideas and recent results of Host-Kra and Ziegler in 
ergodic theory. 

In future papers we will apply variants of our inverse theorems to obtain an asymptotic
for the number of quadruples p_1 < p_2 < p_3 < p_4 ≤ N of primes in arithmetic
progression, and to obtain significantly stronger bounds for r_4(G).


1. Background and Motivation 

A famous and deep theorem of Szemeredi asserts that any set of integers of positive 
upper density contains arbitrarily long arithmetic progressions. More precisely: 

Theorem 1.1 (Szemerédi’s theorem, infinitary version). [20] Let A be a subset of the
integers Z whose upper density limsup_{N→∞} |A ∩ [−N, N]| / (2N + 1) is strictly positive.
Then for any k ≥ 1, the set A contains infinitely many arithmetic progressions
{a, a + r, ..., a + (k − 1)r}, r ≠ 0, of length k.

The second author is supported by a grant from the Packard Foundation.

The first non-trivial case of this theorem is when k = 3, which was treated by Roth
using a Fourier-analytic argument. The case of higher k was more resistant to
Fourier-analytic methods, and the first full proof of this theorem was achieved by Szemerédi
0 using combinatorial methods. Later, Furstenberg [THl EH] introduced an ergodic
theoretic proof of this theorem. More recently, Gowers [27] gave a proof which was
both combinatorial and Fourier-analytic in nature, and which is substantially closer in
spirit to Roth’s original argument than the other proofs. Even more recently there have
been a number of other proofs of this theorem by other methods, such as hypergraph
regularity [2H1 EZ] EHl EH EH ESj or “discrete ergodic theory” EH. This theorem and
its various proofs have in turn generated many other mathematical developments. For
instance, in |H1|, we were able to apply Theorem 1.1 to demonstrate that the primes
contain arbitrarily long arithmetic progressions.

In this paper we shall be interested primarily in the Fourier-analytic approach to this
theorem, specifically in the k = 4 case, which was treated separately by Gowers in j2H|
and then again in EH. This latter paper will be our key reference. However as we shall
see later there are some strong connections between this approach and the ergodic one,
especially after the work on characteristic factors by Host and Kra iHnum and Ziegler
EHEB], and on the connection to nilsequences by Bergelson, Host, and Kra [H]. Before
we give our main new results, however, we first give some further historical background
and motivation.

Gowers’ proof of the full Szemerédi theorem in EH is quite lengthy and involves many
deep new ideas. However, it is possible to split it up into a number of simpler steps, all
but one of which are straightforward. Firstly, it is easy to show that for any fixed k,
Theorem 1.1 is equivalent to the following finitary version.

Theorem 1.2 (Szemerédi’s theorem, finitary version IHHj). Let δ > 0 and k ≥ 1. Then
there exists an integer N_0 = N_0(δ, k) such that whenever N ≥ N_0 and A ⊆ [1, N] is
such that |A|/|[1, N]| ≥ δ, then A contains at least one proper arithmetic progression of
length k.

The next observation, due to Roth, is that one can hope to prove this theorem by
downwardly inducting on the density parameter δ (the case δ ≥ 1 being trivial or
vacuous). In particular, for any fixed k, Theorem 1.2 is equivalent to the following
assertion.

Theorem 1.3 (Lack of progressions implies density increment). Let δ > 0 and k ≥ 1.
Let N ≥ 1, and let A ⊆ [1, N] be such that |A|/|[1, N]| ≥ δ, and such that A contains
no proper arithmetic progressions of length k. Then, if N is sufficiently large depending
on k and δ, there exists an arithmetic progression P ⊆ [1, N] with |P| ≥ ω(N, δ) for
some function ω(N, δ) of N which goes to infinity as N → ∞ for each fixed δ, such that
we have the density increment |A ∩ P|/|P| ≥ δ + c(δ), where c(δ) > 0 is a function of
δ which is bounded away from zero whenever δ is bounded away from zero.

The deduction of Theorem 1.2 from Theorem 1.3 is a straightforward induction argument.
For details of arguments of this type any of OEHEIinilEil may be consulted,
or indeed HTIorfTTlof this paper. Of course, the final bound N_0(δ, k) obtained in Theorem
1.2 will depend on the explicit bounds ω(N, δ), c(δ) obtained in Theorem 1.3. In Roth’s
k = 3 argument in jHlj, c(δ) was roughly δ^2 and ω(N, δ) was roughly N^{1/2}, which led
to a final bound of the form N_0(δ, 3) ≤ exp(exp(C/δ)). In Gowers’ extension of Roth’s
argument in [27], c(δ) was roughly δ^{C_k} and ω(N, δ) was roughly N^{c_k δ^{C_k}} for some
c_k, C_k > 0 depending only on k, which led to a final bound of the form
N_0(δ, k) ≤ exp(exp(C_k/δ^{C_k})) (see ra for a more precise statement). These are the best
known bounds for N_0(δ, k) except in the k = 3 case, where the current record is
N_0(δ, 3) ≤ (C/δ)^{C/δ^2}, due to Bourgain nn.
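The way a density increment turns into a bound on N_0 can be sketched numerically. The following is only an illustration under stated assumptions: the increment c(δ) = δ^2 and the replacement of N by N^{1/2} at each step are toy choices, the threshold `log_floor` is arbitrary, and the function names are hypothetical rather than taken from the paper.

```python
def increment_steps(delta, c=lambda d: d * d):
    """Count density-increment steps until the density would exceed 1.

    Each step models one application of a density-increment theorem,
    replacing delta by delta + c(delta). The choice c(delta) = delta**2
    is an illustrative assumption.
    """
    steps = 0
    while delta <= 1:
        delta += c(delta)
        steps += 1
    return steps


def required_log_N(delta, log_floor=1.0, exponent=0.5):
    """Lower bound on log N needed to survive every step.

    Assumes each step replaces N by N**exponent (so log N is multiplied
    by `exponent`) and that the argument needs log N >= log_floor at the end.
    """
    return log_floor / exponent ** increment_steps(delta)


steps = increment_steps(0.1)
print(steps, required_log_N(0.1))
```

Since the number of steps grows like a power of 1/δ and each step takes a root of N, the required log N grows exponentially in that power, which is the source of the double exponential shape of the bounds quoted above.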


The next step is to pass from the interval [1, N] to a cyclic group Z/NZ for some prime
N. Indeed, by using Bertrand’s postulate^1 and a simple covering argument to split
progressions in Z/NZ into progressions in [1, N], one can show that Theorem 1.3 is in
turn equivalent for each fixed k (up to minor changes in the bounds ω(N, δ) and c(δ))
to the following statement.


Theorem 1.4 (Lack of progressions implies density increment). Let δ > 0 and k ≥ 1.
Let N ≥ 1 be a prime, and let Q be a proper progression in Z/NZ such that |Q| ≥ c_0 N
for some 0 < c_0 ≤ 1. Let A ⊆ Q be such that |A| ≥ δN, and such that A contains no
proper arithmetic progressions of length k. Then, if N is sufficiently large depending on
k and δ, there exists a proper arithmetic progression P ⊆ Z/NZ with |P| ≥ ω(N, δ, c_0, k)
for some function ω(N, δ, c_0, k) of N which goes to infinity as N → ∞ for each fixed δ,
c_0, k, such that we have the density increment

    \frac{|A \cap P|}{|P|} \geq \frac{|A \cap Q|}{|Q|} + c(\delta, c_0, k),

where c(δ, c_0, k) > 0 is bounded away from zero whenever δ, c_0 are bounded away from
zero and k is fixed.


The deduction of Theorem 1.3 from Theorem 1.4 is not difficult, see [23 EH EH. Of
course, it remains to prove Theorem 1.4. This was achieved in the k = 3 case by Roth
using Fourier-analytic methods. To extend these arguments to the case of higher k,
Gowers introduced a collection of tools which form a part of a theory which might be
termed “higher-order Fourier analysis” for reasons which will become clear later. In
particular, to handle the k = 4 case required “quadratic Fourier analysis”.


While Gowers’ original argument takes place in a cyclic group Z/NZ of prime order, we
will work in the more general setting of an arbitrary finite additive group. This might
seem unnecessary, but is consistent with what we call the finite field philosophy. This
is the observation that many questions concerning the integers {1, ..., N} or the cyclic
group Z/NZ may be asked very naturally for an arbitrary finite abelian group G, and
they may be answered there by modifying the proof for Z/NZ in a straightforward way.
Thus it is often the case that the passage

    \mathbb{Z}/N\mathbb{Z} \longrightarrow G

is rather straightforward.


However, it may be that the question is significantly easier to answer when G is some
specific group, typically a vector space over a finite field such as F_2^n, F_3^n or F_5^n. This
observation was made in such papers as mi EH. In this paper we will take a particular
interest in F_5^n since this is the smallest characteristic field for which arithmetic
progressions of length 4 are a sensible thing to discuss. Now the passage

    \mathbb{F}_5^n \xrightarrow{\text{generalization}} G

might not be at all easy. However, in attempting such a route one has split the problem
into two presumably easier subproblems, and furthermore there is now a library of tools
available for effecting the generalization. This started with the work of Bourgain uni
(though he did not phrase it this way), and has continued with various works such as
[aniiS2]. The present paper, particularly IJHland IJHI is another example in this vein. For
a longer discussion of the finite field philosophy, see inn.

^1 That is, there is always a prime between X and 2X.


Definition 1.5 (Additive groups). Define an additive group to be a group G = (G, +)
with a commutative group operation +; if x ∈ G and n ∈ Z we can define the product
nx ∈ G in the usual manner. If f : G → H is a function from one additive group
to another, and h ∈ G, we define the shift^2 operator T^h applied to f by the formula
T^h f(x) := f(x + h), and the difference operator h·∇ := T^h − 1 applied to f by the
formula (h·∇)f(x) := f(x + h) − f(x). We extend these definitions to functions of
several variables by subscripting the variable to which the operator is applied, thus for
instance if f(x, y) is a function of two variables we define T^h_x f(x, y) := f(x + h, y) and
h·∇_x f(x, y) := f(x + h, y) − f(x, y) if x, h range inside an additive group G, and similarly
for the y variable.


Remark. Throughout the paper, we will write N := |G| for the cardinality of G. 


Remark. The notation above is of course designed to mimic that of several variable
calculus. We caution however that we do not assign any independent meaning to the
symbol ∇, unless it is prepended with a shift h to create a difference operator h·∇,
which is of course a discrete analogue of a directional derivative operator.

We now introduce a multilinear form Λ_k(f_0, ..., f_{k−1}) which is useful for counting
arithmetic progressions. Here, and throughout the paper, it is convenient to adopt the
notation of conditional expectation, which allows one to hide some distracting
normalizing factors such as 1/N in our arguments. Thus if f : G → C is a complex-valued
function on a finite set G, and B ⊆ G is a non-empty subset of G, we will
use E_{x∈B} f(x) := (1/|B|) Σ_{x∈B} f(x) to denote the average of f over B. We will abbreviate
E_{x∈G} f(x) as E(f) when the domain G of f is clear from context.


Now if G is a finite additive group and f_0, ..., f_{k−1} : G → C are complex-valued
functions, we define the k-linear form Λ_k(f_0, ..., f_{k−1}) ∈ C by

    \Lambda_k(f_0, \ldots, f_{k-1}) := \mathbb{E}_{x,r \in G}\, f_0(x)\, T^{r} f_1(x) \cdots T^{(k-1)r} f_{k-1}(x).

Observe that if A ⊆ G and f_0 = ... = f_{k−1} = 1_A, where 1_A : G → {0, 1} denotes the
indicator function of A, then Λ_k(1_A, ..., 1_A) is just the number of progressions of length
k (including those with common difference 0), divided by the normalizing factor of N^2.
In particular, if (N, (k − 1)!) = 1 and A contains no proper progressions of length k
then we see that Λ_k(1_A, ..., 1_A) = |A|/N^2, which will be quite small when N is large.
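As a small sanity check of this counting interpretation, the form Λ_k can be evaluated by brute force on a cyclic group. This is a minimal sketch, assuming G = Z/NZ with N = 11 and a hand-picked set A; the function name `lambda_k` is illustrative only.

```python
from fractions import Fraction
from itertools import product


def lambda_k(fs, N):
    """Brute-force Lambda_k(f_0, ..., f_{k-1}) on Z/NZ.

    fs is a list of k lists of length N; the result is the average over
    all pairs (x, r) of the product f_0(x) f_1(x + r) ... f_{k-1}(x + (k-1) r).
    """
    total = 0
    for x, r in product(range(N), repeat=2):
        term = 1
        for j, f in enumerate(fs):
            term *= f[(x + j * r) % N]
        total += term
    return Fraction(total, N * N)


N = 11
A = {0, 1, 3, 4}  # no proper 3-term progression modulo 11
indicator = [1 if x in A else 0 for x in range(N)]

# Only the |A| degenerate progressions (r = 0) contribute.
print(lambda_k([indicator] * 3, N), Fraction(len(A), N * N))
```

For the full group A = Z/NZ every pair (x, r) contributes, so the same function returns 1.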

^2 This “ergodic” notation corresponds to the backwards shift T^h x := x − h on the underlying group
G. We will discuss further connections with ergodic theory in m

It is thus of interest to determine under what conditions Λ_k(1_A, ..., 1_A) is small or large.
To this end, Gowers introduced (what are now known as) the Gowers uniformity norms
||f||_{U^d(G)} for any complex function f : G → C, whose definition we now recall.

Definition 1.6 (Gowers uniformity norm). Let d ≥ 0, and let f : G → C be a function.
We define the Gowers uniformity norm ||f||_{U^d(G)} ≥ 0 of f to be the quantity

    \|f\|_{U^d(G)} := \Big( \mathbb{E}_{x \in G,\, h \in G^d} \prod_{\omega \in \{0,1\}^d} \mathcal{C}^{|\omega|} f(x + \omega \cdot h) \Big)^{1/2^d},

where ω = (ω_1, ..., ω_d), h = (h_1, ..., h_d), ω·h := ω_1 h_1 + ... + ω_d h_d, |ω| := ω_1 + ... + ω_d,
and C is the conjugation operator C f(x) := \overline{f(x)}.

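Remark. An equivalent definition of the U^d(G) norms is given by the recursive formulae

    \|f\|_{U^0(G)} = \mathbb{E}(f); \quad \|f\|_{U^1(G)} = |\mathbb{E}(f)|; \quad \|f\|_{U^d(G)} = \Big( \mathbb{E}_{h \in G} \|T^h f\, \overline{f}\|_{U^{d-1}(G)}^{2^{d-1}} \Big)^{1/2^d}    (1.1)

for all d ≥ 1.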
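The agreement between the direct definition and the recursive formula (1.1) can be checked numerically on a small cyclic group. The sketch below assumes G = Z/5Z and an arbitrary test function; the helper names are hypothetical, and the brute-force loops are only practical for very small N and d.

```python
import cmath
from itertools import product


def gowers_pow_direct(f, d):
    """||f||_{U^d}^{2^d} on Z/NZ, straight from the definition (N = len(f))."""
    N = len(f)
    total = 0
    for x in range(N):
        for h in product(range(N), repeat=d):
            term = 1
            for omega in product((0, 1), repeat=d):
                value = f[(x + sum(o * hi for o, hi in zip(omega, h))) % N]
                # Conjugate exactly when |omega| is odd.
                term *= value.conjugate() if sum(omega) % 2 else value
            total += term
    return total / N ** (d + 1)


def gowers_pow_recursive(f, d):
    """The same quantity via (1.1): average over h of the U^{d-1} power of T^h f * conj(f)."""
    N = len(f)
    if d == 0:
        return sum(f) / N
    total = 0
    for h in range(N):
        shifted = [f[(x + h) % N] * f[x].conjugate() for x in range(N)]
        total += gowers_pow_recursive(shifted, d - 1)
    return total / N


def gowers_norm(f, d):
    return abs(gowers_pow_direct(f, d)) ** (1 / 2 ** d)


N = 5
f = [cmath.exp(2j * cmath.pi * x * x / N) * (1 + x % 3) / 3 for x in range(N)]
for d in (1, 2, 3):
    print(d, gowers_pow_direct(f, d), gowers_pow_recursive(f, d))
```

The two computations should agree up to floating-point rounding, and the direct value has negligible imaginary part for d ≥ 1.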

Remark. A configuration of the form {x + ω·h : ω ∈ {0, 1}^d} is called a cube of dimension d.
Thus ||f||_{U^d(G)}^{2^d} is a weighted average of f over cubes; for instance, ||1_A||_{U^d(G)}^{2^d} is equal to
the number of cubes contained in A, divided by the normalizing factor of N^{d+1}. The
cases d = 0, 1 are rather degenerate, and indeed U^d(G) is not a norm in these cases.
However for d > 1, one can show that ||·||_{U^d(G)} is indeed a norm, i.e. it is homogeneous,
non-negative, non-degenerate, and obeys the triangle inequality, see m Lemma 3.9].
These norms have also appeared recently in ergodic theory, see for instance |1T], and
(together with the dual norms U^d(G)*) played a key role in [2l]. It thus seems of interest
to study these norms more systematically; the results here can be viewed as a step in
that direction.

We will study these norms in detail later, but for now let us give an example to illustrate
what they are trying to capture. Suppose f has the form f(x) := e(φ(x)) for some phase
function φ : G → R/Z, where e : R/Z → C is the exponential map e(x) := e^{2πix}. Then
a simple calculation shows that

    \|f\|_{U^d(G)}^{2^d} = \mathbb{E}_{x, h_1, \ldots, h_d \in G}\, e\big( (h_1 \cdot \nabla_x) \cdots (h_d \cdot \nabla_x) \phi(x) \big).

Thus the U^d(G) norm is in some sense measuring the oscillation present in the d-th
“derivative” of the phase. In particular, we expect the U^d(G) norm to be large if the phase
behaves like a “polynomial” of degree d − 1 or less, but small if the phase is behaving
like a polynomial of degree d or higher.
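This heuristic can be seen concretely for a quadratic phase. The sketch below assumes G = Z/7Z and the phase φ(x) = x^2/N, both chosen for illustration; the helper `gowers_pow` is a hypothetical brute-force evaluator of ||f||_{U^d}^{2^d}.

```python
import cmath
from itertools import product


def e(t):
    """The exponential map e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)


def gowers_pow(f, d):
    """Brute-force ||f||_{U^d}^{2^d} on Z/NZ, where N = len(f)."""
    N = len(f)
    total = 0
    for x in range(N):
        for h in product(range(N), repeat=d):
            term = 1
            for omega in product((0, 1), repeat=d):
                value = f[(x + sum(o * hi for o, hi in zip(omega, h))) % N]
                term *= value.conjugate() if sum(omega) % 2 else value
            total += term
    return total / N ** (d + 1)


N = 7
quadratic = [e(x * x / N) for x in range(N)]

# The second derivative of x^2 is 2*h1*h2, which oscillates: the average is 1/N.
u2_pow = gowers_pow(quadratic, 2).real
# The third derivative of x^2 vanishes identically: every term equals 1.
u3_pow = gowers_pow(quadratic, 3).real
print(u2_pow, u3_pow)
```

So the quadratic phase has U^3 norm equal to 1 but U^2 norm equal to N^{-1/4}, matching the expectation that degree 2 behaviour is invisible to U^2 and fully captured by U^3.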

We observe that as an immediate consequence of (1.1) and induction we have the
monotonicity property

    \|f\|_{U^d(G)} \leq \|f\|_{U^{d+1}(G)} \quad \text{for } d = 0, 1, 2, \ldots    (1.2)

The relevance of the Gowers uniformity norms to arithmetic progressions lies in the
following result, which was stated explicitly in m Theorem 3.2] (in the case of cyclic
groups G = Z/NZ) but has been implicit in the ergodic theory literature for some time.
Write D := {z ∈ C : |z| ≤ 1} for the unit disk.

Proposition 1.7 (Generalized von Neumann Theorem). Let G be a finite abelian group
with (N, (k − 1)!) = 1. Let f_0, ..., f_{k−1} : G → D be functions. Then we have

    |\Lambda_k(f_0, \ldots, f_{k-1})| \leq \min_{0 \leq j \leq k-1} \|f_j\|_{U^{k-1}(G)}.

It is instructive to continue with the phase example given earlier. If f_j = e(φ_j), then

    \Lambda_k(f_0, \ldots, f_{k-1}) = \mathbb{E}_{x,r \in G}\, e\big( \phi_0(x) + \phi_1(x + r) + \ldots + \phi_{k-1}(x + (k-1)r) \big).

Thus Λ_k(f_0, ..., f_{k−1}) is measuring the oscillation present in the expression φ_0(x) +
φ_1(x + r) + ... + φ_{k−1}(x + (k − 1)r). Proposition 1.7 can then be viewed as a statement that
if this expression does not oscillate, then neither do the expressions (h_1·∇_x) ... (h_{k−1}·∇_x)φ_j(x)
for any 0 ≤ j ≤ k − 1. Note that such a fact morally follows by “differentiating”
the expression φ_0(x) + φ_1(x + r) + ... + φ_{k−1}(x + (k − 1)r) in k − 1 different directions
to eliminate all but one of the terms in this series. For completeness we give a proof of
this Proposition in Sectional
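The case k = 3 of Proposition 1.7 is easy to test numerically. This is a sketch only, assuming G = Z/7Z (so that (N, 2) = 1) and pseudo-random functions into the unit disk; the function names and the seed are arbitrary choices for the illustration.

```python
import cmath
import random
from itertools import product


def lambda_3(f0, f1, f2):
    """Lambda_3(f0, f1, f2) on Z/NZ by brute force."""
    N = len(f0)
    return sum(f0[x] * f1[(x + r) % N] * f2[(x + 2 * r) % N]
               for x, r in product(range(N), repeat=2)) / N ** 2


def u2_norm(f):
    """||f||_{U^2(Z/NZ)} by brute force over all two-dimensional cubes."""
    N = len(f)
    total = sum(f[x] * f[(x + a) % N].conjugate()
                * f[(x + b) % N].conjugate() * f[(x + a + b) % N]
                for x, a, b in product(range(N), repeat=3))
    return abs(total / N ** 3) ** 0.25


def random_bounded(N, rng):
    """A function Z/NZ -> unit disk with pseudo-random modulus and phase."""
    return [rng.random() * cmath.exp(2j * cmath.pi * rng.random()) for _ in range(N)]


rng = random.Random(0)
N = 7
fs = [random_bounded(N, rng) for _ in range(3)]
lhs = abs(lambda_3(*fs))
rhs = min(u2_norm(f) for f in fs)
print(lhs, rhs)
```

The printed left-hand side should not exceed the right-hand side, up to rounding error.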

Corollary 1.8 (Lack of progressions implies large uniformity norm [27]). Let k ≥ 3,
let G be a finite additive group with (N, (k − 1)!) = 1, and let A ⊆ G, |A| = αN,
be a non-empty set such that A has no proper arithmetic progressions of length k. If
N ≥ 2/α^{k−1} then we have ||1_A − α||_{U^{k−1}(G)} ≥ α^k/2^{k+1}.

More generally, let P be a proper arithmetic progression in G such that |P| ≥ c_0 N. Let
A ⊆ P, |A| = α|P|, be a non-empty set which contains no proper arithmetic progressions
of length k. If N > N_0(c_0, k, α) then we have ||1_A − α1_P||_{U^{k−1}(G)} ≥ c(c_0, α, k) > 0, where
the quantity c(c_0, α, k) stays bounded away from zero when c_0, α are bounded away from
zero and k is fixed.

Proof. We begin with the first claim. Since 1_A = α + (1_A − α), we can split the
expression Λ_k(1_A, ..., 1_A) as the sum of 2^k expressions, one of which is Λ_k(α, ..., α),
and the other 2^k − 1 of which can be bounded in magnitude by ||1_A − α||_{U^{k−1}(G)} thanks
to Proposition 1.7. In particular we conclude that

    |\Lambda_k(\alpha, \ldots, \alpha) - \Lambda_k(1_A, \ldots, 1_A)| \leq 2^k \|1_A - \alpha\|_{U^{k-1}(G)}.

But clearly Λ_k(α, ..., α) = α^k, while since A has no proper arithmetic progressions we
have Λ_k(1_A, ..., 1_A) = |A|/N^2 = α/N ≤ α^k/2. The first claim follows.

The second claim proceeds similarly but is based upon the decomposition 1_A = α1_P +
(1_A − α1_P), and the observation that Λ_k(1_P, ..., 1_P) ≥ c(c_0, k) > 0 for some positive
quantity c(c_0, k) depending on c_0 and k; we leave the details to the reader. □

Comparing this Corollary with Theorem 1.4 we thus see that in order to prove
Szemerédi’s theorem for a fixed k, it suffices to prove the following:

Theorem 1.9 (Large uniformity norm implies density increment [27]). Let η > 0 and
k ≥ 3. Let G = Z/NZ be a cyclic group of prime order, and let f : G → D be a
real-valued bounded function such that E(f) = 0 and ||f||_{U^{k−1}(G)} ≥ η. Then, if N > N_0(k, η),
there exists a proper arithmetic progression P ⊆ G with |P| ≥ ω(N, η, k) such that
E_{x∈P} f(x) ≥ c(η, k), where

• ω(N, η, k) → ∞ as N → ∞ for fixed η and k;

• c(η, k) > 0 is bounded away from zero when η is bounded away from zero and k
is fixed.

Indeed, Theorem 1.4 then follows by applying Corollary 1.8 and then invoking Theorem
1.9 with f := 1_A − (E_{x∈Q} 1_A(x)) 1_Q. Theorem 1.9 is in fact deduced in [27] from the
following stronger theorem:

Theorem 1.10 (Weak inverse theorem for U^{k−1}(Z/NZ) [27]). Let η > 0 and k ≥ 3.
Let G = Z/NZ be a cyclic group of prime order, and let f : G → D be a bounded
function such that ||f||_{U^{k−1}(G)} ≥ η. Then, if N ≥ exp(C_k η^{−C_k}) for some sufficiently
large C_k > 0, one can partition Z/NZ into arithmetic progressions (P_j)_{j∈J}, each of size
|P_j| ≥ c_k N^{c_k η^{C_k}} for some c_k, C_k > 0, such that

    \sum_{j \in J} |\mathbb{E}(f 1_{P_j})| \geq c_k \eta^{C_k}

for some c_k, C_k > 0.

Theorem 1.9 (and hence Theorem 1.4) follows quickly from this and the mean zero
hypothesis Σ_{j∈J} E(f 1_{P_j}) = E(f) = 0. Indeed one gets a fairly good quantitative result
for Theorem 1.2 with N_0 = exp(exp(C_k δ^{−C_k})) for some explicit C_k > 0; see [27].

We refer to Theorem 1.10 as a weak inverse theorem because it gives a necessary criterion
in order for a bounded function f to have large U^{k−1}(G) norm, and hence a sufficient
condition for the U^{k−1}(G) norm to be small. As discussed above, this theorem is strong
enough to imply Szemerédi’s theorem. Also, Theorem 1.10 could potentially be useful,
when combined with such tools as Proposition 1.7, for not only demonstrating the existence
of progressions of length k in a given set A, but in fact providing an accurate count
as to how many such progressions there are. For instance, one might hope to count
the number of progressions of length k in the primes less than N by using Theorem
1.10 to show that a certain counting function f associated to the primes has small
U^{k−1}(Z/NZ) norm and hence its contribution to the count of progressions in the primes
could be controlled using Proposition 1.7. However, the sufficient condition for smallness
of U^{k−1}(Z/NZ) given by Theorem 1.10 is very difficult to verify for sets such as the
primes (being at least as difficult as the Elliott-Halberstam conjecture, which is not
known to be implied even by the GRH).

It is thus of interest to obtain a better inverse theorem for the U^{k−1}(Z/NZ) norm, which
gives a more easily checkable condition for when this norm is small. Ideally we would
like this condition to be both necessary and sufficient, at least up to constant losses. In
this paper we shall achieve these objectives for k = 4.

In subsequent work we will give various applications of the results and methods of this
paper. In (SHI we obtain a new bound on the size of the largest subset of the vector
space F_5^n with no 4-term arithmetic progression, and we hope to generalize that result
to arbitrary abelian groups G. In another series of papers, we will obtain an asymptotic
formula for the number of quadruples p_1 < p_2 < p_3 < p_4 ≤ N of primes in arithmetic
progression.

2. Inverse theorems for U^d(G) norms

We have now motivated why we are interested in an inverse theorem for the U^{k−1} norms.
Before we state our main theorems, let us give some other examples and results which
will illustrate what the inverse theorem should be. Recall that the U^d norm of a function
e(φ) measures the oscillation in the d-th derivative of the phase φ. Also recall that a
polynomial of degree at most d − 1 is a function whose d-th derivative vanishes. We
generalize this concept as follows.

Definition 2.1 (Locally polynomial phase functions). If B is any non-empty subset of
a finite additive group G and d ≥ 1, we say that a function φ : B → R/Z is a polynomial
phase function of order at most d − 1 locally on B if we have

    (h_1 \cdot \nabla_x) \cdots (h_d \cdot \nabla_x) \phi(x) = 0

whenever the cube (x + ω_1 h_1 + ... + ω_d h_d)_{ω_1, ..., ω_d ∈ {0,1}} is contained in B. If f : B → C
is a function, we define the local polynomial bias of order d on B, ||f||_{u^d(B)}, to be the
quantity

    \|f\|_{u^d(B)} := \sup_{\phi} |\mathbb{E}_{x \in B}( f(x) e(-\phi(x)) )|

where φ ranges over all local polynomial phase functions of order at most d − 1 on B.

To begin with we will work in the global setting B = G, but as will become clear later we
will need to also work in the local setting. We will refer to polynomial phase functions
of degree at most 1 as linear phase functions, and of degree at most 2 as quadratic phase
functions, with the modifiers “local” or “global” as appropriate.

The quantity ||f||_{u^d(B)} is clearly a seminorm. It shares several features in common with
the U^d(G) norm. First of all, like the U^d(G) norm, we have the monotonicity
||f||_{u^d(B)} ≤ ||f||_{u^{d+1}(B)}, and when B = G we also have the shift invariance
||T^h f||_{u^d(G)} = ||f||_{u^d(G)}. We also have the conjugation symmetry
||\overline{f}||_{u^d(B)} = ||f||_{u^d(B)}, and the phase invariance ||f e(φ)||_{u^d(B)} = ||f||_{u^d(B)}
whenever φ is a locally polynomial phase of degree at most d − 1 on B. The latter
invariance also extends to the U^d(G) norm, thus

    \|f e(\phi)\|_{U^d(G)} = \|f\|_{U^d(G)}    (2.1)

whenever φ : G → R/Z is a global polynomial phase function of degree at most d − 1.
Indeed, this invariance^3 can easily be seen from (1.1) and induction, using the fact that
the derivative of a polynomial of degree at most d − 1 is a polynomial of degree at most d − 2.

From this invariance and (1.1), (1.2) we conclude that

    \|f\|_{U^d(G)} = \|f e(-\phi)\|_{U^d(G)} \geq \|f e(-\phi)\|_{U^1(G)} = |\mathbb{E}_{x \in G}( f(x) e(-\phi(x)) )|

whenever φ is a global polynomial phase of degree at most d − 1. Taking suprema over
all φ, we obtain the inequality

    \|f\|_{U^d(G)} \geq \|f\|_{u^d(G)}    (2.2)

for all d ≥ 1, all additive groups G, and all f : G → C.

It is now natural to ask whether the inequality (2.2) can be reversed. When d = 1
it is easy to verify (using (1.1) and the fact that polynomials of degree at most 0 are
constant) that we in fact have equality:

    \|f\|_{u^1(G)} = \|f\|_{U^1(G)}.

^3 This polynomial phase invariance also indicates why Fourier analysis - which is essentially invariant
under modulation by linear phase functions but not by quadratic or higher phases - is only able to
effectively deal with the U^2(G) norm and not with higher norms. To deal with the U^3(G) norm
thus requires some sort of “quadratic Fourier analysis” which is insensitive to phase modulations by
quadratic phases. The results here can be viewed as some preliminary steps towards establishing such
a quadratic Fourier analysis theory.

Consider next the case d = 2. For this we need the Fourier transform. Let \hat{G} be the
Pontryagin dual of G, in other words the space of homomorphisms ξ : x ↦ ξ·x from G
to R/Z. As is well known, \hat{G} is an additive group which is isomorphic to G. If ξ ∈ \hat{G},
we define the Fourier coefficient \hat{f}(ξ) of f at the frequency ξ by the formula

    \hat{f}(\xi) := \mathbb{E}_{x \in G}\, f(x) e(-\xi \cdot x).

As is well known, we have the Fourier inversion formula

    f(x) = \sum_{\xi \in \hat{G}} \hat{f}(\xi) e(\xi \cdot x)

and the Plancherel identity

    \mathbb{E}(|f|^2) = \sum_{\xi \in \hat{G}} |\hat{f}(\xi)|^2.    (2.3)

One can then easily verify the pleasant identity

    \|f\|_{U^2(G)}^4 = \sum_{\xi \in \hat{G}} |\hat{f}(\xi)|^4.    (2.4)

For instance, this can be achieved by first establishing the identity ||f||_{U^2}^4 = E(|f * f|^2),
where f * f(x) := E_y f(y) f(x − y) is the convolution of f with itself, and then using
(2.3). Next, we make the easy observation that if φ : G → R/Z is a global polynomial
phase function of degree at most 1, then x ↦ φ(x) − φ(0) is a homomorphism from G
to R/Z, and hence there exists ξ ∈ \hat{G} such that φ(x) = ξ·x + φ(0). From this it is easy
to see that

    \|f\|_{u^2(G)} = \sup_{\xi \in \hat{G}} |\hat{f}(\xi)|.    (2.5)

Combining (2.3), (2.4), (2.5) we readily conclude

Proposition 2.2 (Inverse theorem for U^2(G) norm). Let f : G → D be a bounded
function. Then

    \|f\|_{u^2(G)} \leq \|f\|_{U^2(G)} \leq \|f\|_{u^2(G)}^{1/2}.

We remark that this Proposition easily implies the k = 3 case of Theorem 1.10 (using
Dirichlet’s theorem on approximation by rationals to cover G by progressions on which
φ is close to constant), and hence also implies Szemerédi’s theorem for k = 3. Indeed
this is essentially Roth’s original argument [SI], albeit phrased in very modern language.
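The identities (2.4), (2.5) and the two-sided bound of Proposition 2.2 can be checked directly on a small cyclic group. The following sketch assumes G = Z/9Z, identifies the dual group with {0, ..., N − 1} via ξ·x = ξx/N, and uses an arbitrary real test function with values in [−1, 1]; all names are illustrative.

```python
import cmath
from itertools import product


def fourier(f):
    """Fourier coefficients hat f(xi) = E_x f(x) e(-xi x / N) on Z/NZ."""
    N = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * xi * x / N) for x in range(N)) / N
            for xi in range(N)]


def u2_pow4(f):
    """||f||_{U^2}^4 computed from the definition as an average over cubes."""
    N = len(f)
    total = sum(f[x] * f[(x + a) % N].conjugate()
                * f[(x + b) % N].conjugate() * f[(x + a + b) % N]
                for x, a, b in product(range(N), repeat=3))
    return (total / N ** 3).real


N = 9
f = [((x * x + 1) % 5 - 2) / 2 for x in range(N)]  # values in [-1, 1]

coefficients = fourier(f)
fourth_moment = sum(abs(c) ** 4 for c in coefficients)  # right-hand side of (2.4)
big_u2 = u2_pow4(f) ** 0.25                             # ||f||_{U^2}
small_u2 = max(abs(c) for c in coefficients)            # ||f||_{u^2}, by (2.5)
print(u2_pow4(f), fourth_moment, small_u2, big_u2)
```

The first two printed numbers should coincide up to rounding, and the last two should satisfy the chain of inequalities in Proposition 2.2.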

Based on evidence such as Proposition 2.2 one is tempted to conjecture that the U^d(G)
and u^d(G) norms are also related for higher d, in the sense that if f is bounded and
one of the two norms ||f||_{U^d(G)}, ||f||_{u^d(G)} is small, then the other is also. From (2.2)
we already know that one direction is true: smallness of the U^d(G) norm implies the
smallness of the u^d(G) norm. Our first main result establishes a converse to this in
the d = 3 case when G = F_5^n, though with only partially satisfactory control on the
constants.

Theorem 2.3 (Inverse theorem for U^3(F_5^n)). Let f : F_5^n → D be a bounded function
and let 0 < η ≤ 1.

(i) If ||f||_{U^3(F_5^n)} ≥ η, then there exists a subspace W ≤ F_5^n of codimension at most
(2/η)^C such that

    \mathbb{E}_{y \in \mathbb{F}_5^n} \|f\|_{u^3(y+W)} \geq (\eta/2)^C,    (2.6)

where we can take C = 2^{16}. In particular, there exists y ∈ F_5^n such that
||f||_{u^3(y+W)} ≥ (η/2)^C.

(ii) Conversely, given any subspace W ≤ F_5^n and any function f : F_5^n → C we have
||f||_{U^3(F_5^n)} ≥ ||f||_{u^3(F_5^n)} ≥ 5^{−n}|W| ||f||_{u^3(y+W)} for any y ∈ F_5^n.


Combining the two parts of the theorem together we see that

    \|f\|_{u^3(\mathbb{F}_5^n)} \leq \|f\|_{U^3(\mathbb{F}_5^n)} \leq \frac{C}{\log^{c}(1 + 1/\|f\|_{u^3(\mathbb{F}_5^n)})}    (2.7)


for some absolute constants c, C > 0. Thus this does give a result which asserts that
the smallness of the U^3(F_5^n) norm implies the smallness of the u^3(F_5^n) norm and vice
versa, although the dependence of constants is poor^4. Note however that the control
is much better if one localizes the quadratic bias norm u^3(F_5^n) to cosets y + W of W.
We remark that there is nothing particularly special about the finite field F_5, and one
has similar results for any other finite field of odd characteristic, though the constants
depend of course on the field.


We shall prove Theorem 2.3 (i) in ® ((ii) is easier, and we will prove it in …); it contains
many of the main ideas of this paper, which combine the Fourier and combinatorial
analysis of Gowers in [27] with an additional “symmetry argument” which is necessary
to obtain a strong inverse theorem instead of a weak inverse theorem. As a consequence
we obtain, in m, a Szemerédi theorem for progressions of length 4 in F_5^n.


Let us now discuss finite abelian groups G in general, particular importance being
attached to Z/NZ on account of potential applications. It is tempting to conjecture,
in light of the preceding results, that in such groups, any bounded function with small
u^3(G) norm must necessarily have small U^3(G) norm. Unfortunately, such a statement
is false, even for G = Z/NZ. This fact was essentially discovered by Furstenberg
and Weiss [21], in the closely related context of determining characteristic factors for
multiple recurrence in ergodic theory; a similar observation was also made in page 487
of [27]. See §12 for some further discussion of this connection. We give one instance of
the Furstenberg-Weiss example as follows:

Example 2.4. Let N be a large prime number, and let M be the largest integer less
than √N. Let G := Z/NZ, and let f : G → C be the bounded function defined by
setting f(yM + z) := e(yz/M) ψ(y/M) ψ(z/M) whenever −M/10 ≤ y, z ≤ M/10, and
f = 0 otherwise; here ψ : R → R^+ is a non-negative smooth cutoff function which
equals one on the interval [−1/20, 1/20] and vanishes outside of [−1/10, 1/10]. Then a
direct calculation shows that ||f||_{U^3(G)} ≥ c_0 for some absolute constant c_0 > 0, basically
because all the phases in the expression for ||f||_{U^3(G)}^8 cancel out leaving only the
non-negative cutoffs ψ, whereas a Weyl sum computation reveals that E(f e(−φ)) = O(N^{−c})
for any quadratic phase function φ and some explicit constant c > 0. (Note that when N
is prime, the only quadratic phase functions φ are those of the form φ(x) = ax^2 + bx + c
where a, b, c ∈ R/Z with Na = Nb = 0; see Lemma fd.l jl.) We omit the details.

^4 We conjecture that one can improve the upper bound in (2.7) to a polynomial dependence (bringing
this estimate in line with Proposition 2.2); see ini for further discussion.

The heart of the difficulty here is that the function yM + z ↦ yz/M is locally quadratic
on the set B := {yM + z : −M/10 ≤ y, z ≤ M/10}, which is a fairly large subset
of G, but does not extend (even approximately) to a globally quadratic phase function
on all of G. These locally quadratic phase functions are thus a genuinely new class
of obstructions to having small U^3(G) norm which must now also be accounted for in
order to produce a genuine inverse theorem for the U^3(G) norm. Similar considerations
also apply, of course, to the U^d(G) norms for d > 3.
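The local quadratic behaviour of yM + z ↦ yz/M can be verified exactly on a toy instance. The sketch below assumes N = 101 and M = 10, so that the coordinates y, z range over {−1, 0, 1}; these parameters are far too small to exhibit the asymptotic statements of Example 2.4 and serve only to illustrate Definition 2.1. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

N, M = 101, 10  # toy parameters: M is the largest integer below sqrt(N)

# Each element of B has a unique representation y*M + z with y, z in {-1, 0, 1}.
coords = {(y * M + z) % N: (y, z) for y in (-1, 0, 1) for z in (-1, 0, 1)}
B = set(coords)


def phi(x):
    """The phase yz/M at the point x = y*M + z of B."""
    y, z = coords[x]
    return Fraction(y * z, M)


def cube_points(x, hs):
    return [(x + sum(o * h for o, h in zip(omega, hs))) % N
            for omega in product((0, 1), repeat=len(hs))]


def derivative(x, hs):
    """(h_1.grad)...(h_d.grad) phi(x) modulo 1, for a cube contained in B."""
    d = len(hs)
    total = Fraction(0)
    for omega in product((0, 1), repeat=d):
        point = (x + sum(o * h for o, h in zip(omega, hs))) % N
        total += (-1) ** (d - sum(omega)) * phi(point)
    return total % 1


def cubes_in_B(d):
    """All pairs (x, (h_1, ..., h_d)) whose cube lies entirely inside B."""
    for x in B:
        for ps in product(B, repeat=d):
            hs = [(p - x) % N for p in ps]
            if all(p in B for p in cube_points(x, hs)):
                yield x, hs


third = {derivative(x, hs) for x, hs in cubes_in_B(3)}
second = {derivative(x, hs) for x, hs in cubes_in_B(2)}
print(third, sorted(second))
```

Every third derivative over a cube inside B vanishes modulo 1, while some second derivatives do not, so the phase is quadratic but not linear locally on B.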

We must therefore understand the proper generalization of sets such as B = {yM + z :
−M/10 ≤ y, z ≤ M/10}. It turns out that there are two ways to obtain such a
generalization, which are in a sense dual to one another, namely that of generalized
arithmetic progressions and that of Bohr sets. For technical reasons it is convenient
to work in the first instance with the latter notion, but we will discuss generalized
arithmetic progressions later in the paper.

Definition 2.5 (Bohr sets). Let G be a finite additive group, and let S ⊆ \hat{G}, |S| = d,
be a subset of the dual group. We define a sub-additive quantity ||·||_S on G by setting

    \|x\|_S := \sup_{\xi \in S} \|\xi \cdot x\|_{\mathbb{R}/\mathbb{Z}},

where ||x||_{R/Z} denotes the distance to the nearest integer, and define the Bohr set
B(S, ρ) ⊆ G for any ρ > 0 to be the set

    B(S, \rho) := \{ x \in G : \|\xi \cdot x\|_{\mathbb{R}/\mathbb{Z}} < \rho \text{ for all } \xi \in S \}.

Note that the dependence of the Bohr set B(S, ρ) on ρ can be rather discontinuous, as
can be seen rather dramatically in finite field geometries such as F_5^n. This is inconvenient
in applications, but fortunately it was noted by Bourgain^5 HU that one may restrict
attention to “regular” Bohr sets which enjoy some limited continuity properties in ρ.

Definition 2.6 (Regular Bohr sets jlHI)- Let S C G, |5| = d, be a set of characters, 
and suppose that $\rho \in (0,1)$. A Bohr set $B(S,\rho)$ is said to be regular if one has

$(1 - 100d|\kappa|)\,|B(S,\rho)| \le |B(S,(1+\kappa)\rho)| \le (1 + 100d|\kappa|)\,|B(S,\rho)|$

whenever $|\kappa| \le 1/100d$.
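
The regularity condition can be probed numerically. The sketch below is only a necessary test under stated assumptions: it works in $\mathbf{Z}/N\mathbf{Z}$ with characters represented by integers, and it samples finitely many values of $\kappa$ rather than all real $\kappa$ with $|\kappa| \le 1/100d$. The function names and the `steps` parameter are hypothetical.

```python
from fractions import Fraction
import math


def bohr_size(N, S, rho):
    """|B(S, rho)| in Z/NZ, assuming xi . x = xi * x / N (mod 1)."""
    def dist(t):
        frac = t - math.floor(t)
        return min(frac, 1 - frac)
    return sum(
        1 for x in range(N)
        if all(dist(Fraction(xi * x, N)) < rho for xi in S)
    )


def is_regular_on_grid(N, S, rho, steps=20):
    """Check the two-sided bound of Definition 2.6 on a finite grid of kappa.

    Passing this test does not prove regularity (the definition quantifies
    over every real kappa in the range); failing it does disprove regularity.
    """
    d = len(S)
    base = bohr_size(N, S, rho)
    kappa_max = Fraction(1, 100 * d)
    for i in range(-steps, steps + 1):
        kappa = kappa_max * Fraction(i, steps)
        size = bohr_size(N, S, (1 + kappa) * rho)
        lower = (1 - 100 * d * abs(kappa)) * base
        upper = (1 + 100 * d * abs(kappa)) * base
        if not (lower <= size <= upper):
            return False
    return True
```

In $\mathbf{Z}/7\mathbf{Z}$ with $S = \{1\}$, the radius $\rho = 2/7$ sits exactly at a jump of $|B(S,\rho)|$ and fails the test, while $\rho = 3/10$ lies in a flat stretch and passes it.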

Lemma o gives a plentiful supply of regular Bohr sets. The constant 100 can be 
lowered but this will not concern us here. With this definition in place, we can now give
the generalization of Theorem 2.3 to arbitrary groups.

Theorem 2.7 (Inverse theorem for $U^3(G)$). Let $G$ be a finite additive group of odd
order, let $f : G \to \mathcal{D}$ be a bounded function and let $0 < \eta \le 1$.

Footnote: In fact rather earlier Gowers [27, Lemma 10.10] employed an argument which establishes that all
Bohr sets in $\mathbf{Z}/N\mathbf{Z}$, $N$ prime, are regular in a very weak sense.
(i) If $\|f\|_{U^3(G)} \ge \eta$ then there exists a regular Bohr set $B := B(S,\rho)$ in $G$ with

$|S| \le (2/\eta)^C$ and $\rho \ge (\eta/2)^C$ such that

$\mathbf{E}_{y \in G}\|f\|_{u^3(y+B)} \ge (\eta/2)^C$ (2.8)

where it is permissible to take $C = 2^{24}$. In particular, there exists $y \in G$ such
that $\|f\|_{u^3(y+B)} \ge (\eta/2)^C$.

(ii) Conversely, if $B = B(S,\rho)$ is a regular Bohr set, if $f : G \to \mathcal{D}$ is a bounded
function and if $\|f\|_{u^3(y+B)} \ge \eta$, then we have

$\|f\|_{U^3(G)} \ge (\eta^3\rho^2/C'd^3)^d$
for some absolute constant $C'$.

Note that the $u^3(G)$ norm is no longer involved in this inverse theorem; this is necessary
as demonstrated by Example 2.4 and has to do with the lack of extendibility of some
local quadratic phases to global ones. In later sections we will prove other, related,
inverse theorems for the U^{G) norm. In iini we will obtain a resnlt in which the 
quadratic phases are given quite explicitly when $G = \mathbf{Z}/N\mathbf{Z}$. Then, in Theorem 12.8,
we will provide a link to recent ergodic-theoretic work of Host-Kra and Ziegler.

In a future series of papers we will prove an enhanced version of Theorem 2.7 and use
it to establish an asymptotic for the number of quadruples $p_1 < p_2 < p_3 < p_4 \le N$ of
primes in arithmetic progression. The enhancement required is that we must be able
to deal with functions $f : \mathbf{Z}/N\mathbf{Z} \to \mathbf{C}$ which are not necessarily bounded, in particular
functions such as $f = \Lambda - 1$, where $\Lambda$ is the von Mangoldt function. Once this is done,
one may analyse the $U^3$ norm of $f$ using what are, in essence, rather classical methods
of analytic number theory such as Vaughan's decomposition of $\Lambda$.

A word of reassurance is perhaps in order for the reader interested in this result
concerning primes. Although the present paper is long, only a few sections of it, namely
sections IHElIHlandini are relevant to that work. In fact it is hoped that the subsequent 
papers on primes will be readable largely independently of the present work. 

Let us briefly mention the connection between the results here and those in inn. In that 
paper the second author introduced the concept of a uniformly almost periodic function
of order $k - 2$, which generalized the concept of a polynomial phase of order at most
$k - 2$ (and which incorporates the “locally polynomial phases” discussed above). One also
obtained (by very elementary means) an inverse theorem for the $U^{k-1}$ norms involving
these uniformly almost periodic functions, see j6H Lemma 5.11]. However, because 
the uniformly almost periodic functions are a larger class than the locally polynomial 
phases, those results are weaker than the inverse theorems presented here. Nevertheless, 
with substantial additional effort (involving for instance the van der Waerden theorem) 
it is possible to use the inverse theorem for uniformly almost periodic functions to obtain 
another proof of Szemeredi’s theorem. See EH for more details. We also remark that 
very similar objects (the anti-uniform functions) were also utilized in |31] in order to 
reduce the task of establishing arbitrarily long progressions in the primes to Szemeredi’s 
theorem. 

Finally, let us offer a word of explanation for our policy concerning constants. For many 
of the arguments of this paper we have supplied exact constants, eschewing excessive use 
of the $O$-notation. This perhaps allows one to better see how bounds from different
lemmas combine with one another to influence later bounds. Some readers may, however,
prefer to replace such quantities as with Cr]~'" when reading the paper. 

3. A MODEL PROBLEM: GLOBAL QUADRATIC PHASE FUNCTIONS 

We now present a simple result, namely the classification of globally quadratic phase
functions on an arbitrary additive group $G$ of odd order, which we will need later, and
which will serve to illustrate our strategy for the more advanced results we give below.

Let us call a homomorphism $M : G \to \widehat{G}$ self-adjoint if we have

$Mx \cdot y - My \cdot x = 0$ for all $x,y \in G$.

Lemma 3.1 (Inverse theorem for globally quadratic phase functions). Let $G$ be a finite
additive group of odd order, and let $\phi : G \to \mathbf{R}/\mathbf{Z}$ be a globally quadratic phase function.
Then there exists $c \in \mathbf{R}/\mathbf{Z}$, $\xi \in \widehat{G}$, and a self-adjoint homomorphism $M : G \to \widehat{G}$ such
that $\phi(x) = Mx \cdot x + \xi \cdot x + c$. Conversely, all such functions $x \mapsto Mx \cdot x + \xi \cdot x + c$ are
globally quadratic phase functions.
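
The converse direction of Lemma 3.1 can be checked by brute force on a small cyclic group. The sketch below assumes that "globally quadratic" means that all third iterated differences vanish, and it represents a phase $\phi(x) \in \frac{1}{N}\mathbf{Z}/\mathbf{Z}$ by the integer $N\phi(x)$ mod $N$. Both function names are ours.

```python
from itertools import product


def quadratic_phase(N, m, xi, c):
    """Table of m*x^2 + xi*x + c (mod N), standing for the phase value / N in R/Z."""
    return [(m * x * x + xi * x + c) % N for x in range(N)]


def third_differences_vanish(phi, N):
    """Test (h1 . grad)(h2 . grad)(h3 . grad) phi = 0 for all shifts in Z/NZ."""
    def diff(g, h):
        # (h . grad) g (x) = g(x + h) - g(x), computed mod N
        return [(g[(x + h) % N] - g[x]) % N for x in range(N)]

    for h1, h2, h3 in product(range(N), repeat=3):
        if any(diff(diff(diff(phi, h1), h2), h3)):
            return False
    return True
```

A cubic such as $x \mapsto x^3$ mod 7 fails the same test, since its third difference with all shifts equal to 1 is the constant 6.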

Proof. The converse is easy, so we focus on the forward direction. It is convenient to
adopt the notation $p(x_1,\ldots,x_n)$ to denote an arbitrary function of variables $x_1,\ldots,x_n$
which takes values in $\mathbf{R}/\mathbf{Z}$, where $p$ can vary from line to line or even within the same
line. This is useful for handling expressions whose exact value is not important for the
argument, but whose functional dependencies on other variables need to be recorded.

We shall give an argument which may seem a bit cumbersome (and is certainly not the 
shortest proof of this lemma), but it will serve to motivate the proof of Theorems 12.31 
and EH Indeed, this lemma can be thought of in some sense as the y = 1 case of those 
theorems. First observe that if $\phi$ is a quadratic phase function, then $(h \cdot \nabla_x)\phi$ is a linear
phase function for each $h$. Thus for each $h \in G$ there exists $\xi_h \in \widehat{G}$ such that

$(h \cdot \nabla_x)\phi(x) = \phi(x+h) - \phi(x) = \xi_h \cdot x + p(h)$ for all $x,h \in G$. (3.1)

The next step is to obtain some linearity on the map $h \mapsto \xi_h$ by using difference
operators to eliminate various terms. Let $k \in G$ be arbitrary. If we apply the difference
operator $(k \cdot \nabla_x)$ to (3.1) we can eliminate the $p(h)$ term to obtain

$(k \cdot \nabla)\phi(x+h) - (k \cdot \nabla)\phi(x) = \xi_h \cdot k$ for all $x,h,k \in G$. (3.2)

If we then apply the difference operator $(h_1 \cdot \nabla_h)$ for some $h_1 \in G$ to eliminate the $(k \cdot \nabla)\phi(x)$
term, we obtain

$(h_1 \cdot \nabla)(k \cdot \nabla)\phi(x+h) = (h_1 \cdot \nabla_h)\xi_h \cdot k$ for all $x,h,k,h_1 \in G$. (3.3)

Making the substitution $y = x + h$ we obtain

$(h_1 \cdot \nabla)(k \cdot \nabla)\phi(y) = (h_1 \cdot \nabla_h)\xi_h \cdot k$ for all $y,h,k,h_1 \in G$. (3.4)

If we then apply the difference operator $(h_2 \cdot \nabla_h)$ for some $h_2 \in G$ to eliminate the
remaining $\phi$ term, we conclude

$0 = (h_2 \cdot \nabla_h)(h_1 \cdot \nabla_h)\xi_h \cdot k$ for all $y,h,k,h_1,h_2 \in G$. (3.5)

Since $k$ is arbitrary, we conclude that

$(h_2 \cdot \nabla_h)(h_1 \cdot \nabla_h)\xi_h = 0$ for all $h,h_1,h_2 \in G$. (3.6)
Thus if we write

$\xi_h = 2Mh + \xi_0$ (3.7)

then we see that $M : G \to \widehat{G}$ is a group homomorphism. Note that we can insert the 2
in front of $M$ because $|G|$ is odd; this factor of 2 will be convenient later. Inserting this
back into (3.1) we obtain

$(h \cdot \nabla_x)\phi(x) = 2Mh \cdot x + \xi_0 \cdot x + p(h)$ for all $x,h \in G$. (3.8)

This is almost what we want, but we must somehow “integrate” the partial derivative
$h \cdot \nabla_x$. To do this we must first establish that $M$ is self-adjoint. Informally, this
self-adjointness reflects the symmetry $(h' \cdot \nabla)(h \cdot \nabla)\phi = (h \cdot \nabla)(h' \cdot \nabla)\phi$ of the second
derivative. To make this rigorous (and in a manner which will extend suitably to more
general situations) we shall use the following “symmetry argument”. In order to focus
on the $Mh \cdot x$ term, we shall write (3.8) as

$p(x+h) + p(x) + p(h) - 2Mh \cdot x = 0$ for all $x,h \in G$. (3.9)

Substituting $y$ for $x$ and subtracting to eliminate $p(h)$, we obtain

$p(y+h) + p(y) + p(x+h) + p(x) - 2Mh \cdot (y - x) = 0$ for all $x,y,h \in G$. (3.10)

Making the substitution $z = x + y + h$, we conclude

$p(x,z) + p(y,z) - 2M(z - x - y) \cdot (y - x) = 0$ for all $x,y,z \in G$. (3.11)

Absorbing as many terms as possible into the unspecified functions $p$, we conclude that

$p(x,z) + p(y,z) + 2\{x,y\} = 0$ for all $x,y,z \in G$, (3.12)

where $\{x,y\}$ is the anti-symmetric form

$\{x,y\} := Mx \cdot y - My \cdot x$. (3.13)

Freezing the value of $z$, we conclude that

$p(x) + p(y) + 2\{x,y\} = 0$ for all $x,y \in G$. (3.14)

Replacing $y$ by $y'$ and subtracting to eliminate the $p(x)$ factor, we conclude

$p(y') + p(y) + 2\{x, y' - y\} = 0$ for all $x,y,y' \in G$. (3.15)

Thus for each $y, y'$, the linear function $x \mapsto 2\{x, y' - y\}$ is independent of $x$ and is hence
always zero; since $|G|$ is odd we may divide by 2 (see the footnote below) to obtain

$\{x, y' - y\} = 0$ for all $x,y,y' \in G$. (3.16)

Thus $M$ is self-adjoint.

We return now to (3.8), which we write as

$p(h) + p(x+h) - \phi(x) - 2Mh \cdot x = 0$ for all $x,h \in G$. (3.17)

From the self-adjointness of $M$ we have

$2Mh \cdot x = M(x+h) \cdot (x+h) - Mx \cdot x - Mh \cdot h$ (3.18)

Footnote: There appear to be some intriguing parallels with symplectic geometry here. Roughly speaking,
the vanishing (3.16) is an assertion that the graph $\{(h,Mh) : h \in G\}$ is a “Lagrangian manifold” on
the “phase space” $G \times \widehat{G}$. This graph can also be interpreted (essentially) as the “wave front set”
$\{(x,\nabla\phi(x)) : x \in G\}$ of the original function $\phi$. A similar interpretation persists in the proofs of

Theorem O and o below. Thus we see hints of some kind of “combinatorial symplectic geometry” 
emerging, though we do not see how to develop these possible connections further. 
and hence

$p(h) + p(x+h) - \phi(x) + Mx \cdot x = 0$ for all $x,h \in G$. (3.19)

In particular, if we apply difference operators in the $h$ and $x + h$ variables we see that
the phase function $-\phi(x) + Mx \cdot x$ is linear:

$(h_1 \cdot \nabla_x)(h_2 \cdot \nabla_x)(-\phi(x) + Mx \cdot x) = 0$ for all $x,h_1,h_2 \in G$. (3.20)

Hence there exists $\xi \in \widehat{G}$ and a $c \in \mathbf{R}/\mathbf{Z}$ such that

$\phi(x) - Mx \cdot x = \xi \cdot x + c$. (3.21)

The claim follows. □
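
The forward direction can also be traced concretely on $G = \mathbf{Z}/N\mathbf{Z}$ with $N$ odd. The sketch below follows the outline of the proof rather than reproducing it: it reads the phase derivative $\xi_1$ off a second difference, halves it (the step that needs $|G|$ odd), subtracts the quadratic part, and reads off the remaining linear phase. The representation of phases as integers mod $N$ and the name `classify_quadratic_phase` are assumptions of this illustration.

```python
def classify_quadratic_phase(phi, N):
    """Recover (m, xi, c) with phi[x] = m*x*x + xi*x + c (mod N), for odd N.

    phi is a list of integers mod N standing for the phase phi[x] / N in R/Z.
    Raises ValueError if phi is not of that form.
    """
    if N % 2 == 0:
        raise ValueError("N must be odd so that 2 is invertible mod N")
    inv2 = pow(2, -1, N)                      # modular inverse (Python 3.8+)
    # xi_1 is the x-slope of the first difference (1 . grad) phi, as in (3.1).
    xi_1 = (phi[2 % N] - 2 * phi[1 % N] + phi[0]) % N
    m = (xi_1 * inv2) % N                     # xi_h = 2*M*h, as in (3.7)
    c = phi[0] % N
    xi = (phi[1 % N] - m - c) % N             # linear part left after removing m*x^2
    if any((m * x * x + xi * x + c - phi[x]) % N for x in range(N)):
        raise ValueError("phi is not a globally quadratic phase on Z/NZ")
    return m, xi, c
```

For example, the table of $3x^2 + 5x + 7$ mod 11 is classified as `(3, 5, 7)`, while the table of $x^3$ mod 7 is rejected.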


As one corollary of this classification, we obtain the following “quadratic extension
theorem”.


Proposition 3.2 (Quadratic Extension Theorem). Let $G$ be an additive group, let
$H \le G$ be a subgroup, and suppose that $y \in G$. Then any quadratic phase function
$\phi : y + H \to \mathbf{R}/\mathbf{Z}$ can be extended (non-uniquely in general) to a globally quadratic
phase function on $G$.

Remark. An important theme of this paper is that this behaviour is specific to cosets
$y + H$, and breaks down for other sets such as Bohr sets.


As a consequence of Proposition 3.2, let us now establish the (easy) second part of
Theorem 2.3. The first inequality $\|f\|_{U^3(G)} \ge \|f\|_{u^3(G)}$ follows from (2.2), so we focus
on the second inequality $\|f\|_{u^3(G)} \ge \frac{|W|}{|G|}\|f\|_{u^3(y+W)}$. It suffices to show that

$\|f\|_{u^3(G)} \ge \frac{|W|}{|G|}\,|\mathbf{E}_{x \in y+W} f(x)e(-\phi(x))|$

whenever $\phi$ is a locally quadratic function on $y + W$. But by the preceding discussion,
we can extend $\phi$ to all of $G$, and write

$\frac{|W|}{|G|}\,|\mathbf{E}_{x \in y+W} f(x)e(-\phi(x))| = |\mathbf{E}_{x \in G} f(x)1_W(x - y)e(-\phi(x))|$.

But by Fourier inversion we may write

$1_W(x - y) = \mathbf{E}_{\xi \in W^\perp} e(\xi \cdot (x - y))$

where $W^\perp := \{\xi \in \widehat{G} : \xi \cdot x = 0 \text{ for all } x \in W\}$. Thus we have

$\mathbf{E}_{x \in G} f(x)1_W(x - y)e(-\phi(x)) = \mathbf{E}_{\xi \in W^\perp}\mathbf{E}_{x \in G} f(x)e(\xi \cdot (x - y) - \phi(x))$.

But since $\xi \cdot (x - y) - \phi(x)$ is globally quadratic in $x$, we have

$|\mathbf{E}_{x \in G} f(x)e(\xi \cdot (x - y) - \phi(x))| \le \|f\|_{u^3(G)}$

and the claim follows from the triangle inequality. Note that this argument in fact works
for arbitrary groups $G$ (with $W$ now being a subgroup of $G$ rather than a subspace).
4. Averaging lemmas

In this section we collect some very simple averaging estimates on which we shall rely
frequently in the sequel.

Lemma 4.1 (Averaging on a subgroup). Let $G$ be an additive group, let $H$ be a finite
subgroup of $G$, and let $A \subseteq H$ be non-empty. Let $f : H \to \mathbf{C}$ be a function. Then

$\mathbf{E}_{x \in H}\mathbf{E}_{y \in x+A} f(y) = \mathbf{E}_{y \in H} f(y)$.

In particular, by the pigeonhole principle there exists $x \in H$ such that

$|\mathbf{E}_{y \in x+A} f(y)| \ge |\mathbf{E}_{y \in H} f(y)|$.

Proof. Since $H + h = H$ for all $h \in A$ we have

$\mathbf{E}_{x \in H} f(x + h) = \mathbf{E}_{y \in H} f(y)$ for all $h \in A$.

Averaging this over all $h \in A$ we obtain the first claim, and the second claim then
follows from the pigeonhole principle. □
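
Lemma 4.1 is easy to confirm on a small example. The sketch below takes $H = G = \mathbf{Z}/N\mathbf{Z}$ and an integer-valued $f$ so that all averages are exact rationals; the function name and the sample data are illustrative assumptions only.

```python
from fractions import Fraction


def subgroup_averages(N, A, f):
    """For H = G = Z/NZ return ([E_{y in x+A} f(y) for each x], E_{y in H} f(y)).

    A is a non-empty list of residues and f a list of N integers.
    """
    local = [
        Fraction(sum(f[(x + a) % N] for a in A), len(A))
        for x in range(N)
    ]
    overall = Fraction(sum(f), N)
    return local, overall


if __name__ == "__main__":
    local, overall = subgroup_averages(6, [0, 1, 4], [3, -1, 4, 1, -5, 9])
    # The local averages have mean equal to the global average, so at least
    # one of them is at least as large in absolute value.
    print(local, overall)
```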

This Lemma will be adequate for our purposes when we are in the finite field geometry 
case, because we will have plenty of subgroups available. In the general group case, 
however, we will also need a more general type of averaging principle. The next lemma 
contains several rather similar formulations of such a result; all of them will be useful 
later on. 

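Lemma 4.2 (Averaging on a Bohr set). Let $S \subseteq \widehat{G}$ be a set of $d$ characters and let
$0 < \rho < 1$ and $\varepsilon \le 1/200d$ be parameters. Suppose that the Bohr set $B := B(S,\rho)$ is
regular, and let $A \subseteq B(S,\varepsilon\rho)$ be any set. Finally, let $f : G \to \mathcal{D}$ be any function. Then

(i) $|\mathbf{E}_{x \in B} f(x+y) - \mathbf{E}_{x \in B} f(x)| \le 200d\varepsilon$ if $y \in A$;

(ii) $|\mathbf{E}_{x \in B} f(x) - \mathbf{E}_{x \in B}\mathbf{E}_{y \in x+A} f(y)| \le 200d\varepsilon$;

(iii) There is some $x \in B$ for which $|\mathbf{E}_{y \in x+A} f(y)| \ge |\mathbf{E}_{y \in B} f(y)| - 200d\varepsilon$;

(iv) There is some $x \in B(S,(1-\varepsilon)\rho)$ such that $|\mathbf{E}_{y \in x+A} f(y)| \ge |\mathbf{E}_{y \in B} f(y)| - 500d\varepsilon$.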

Proof. To prove (i) we must check that

$|\sum_{x \in B+y} f(x) - \sum_{x \in B} f(x)| \le 200d\varepsilon|B|$.

This follows from the fact that $B(S,\rho)$ and $y + B(S,\rho)$ differ in at most $200d\varepsilon|B|$ elements.
Indeed it is easy to see that

$B(S,\rho)\,\Delta\,(y + B(S,\rho)) \subseteq B(S,(1+\varepsilon)\rho) \setminus B(S,(1-\varepsilon)\rho)$

($\Delta$ denotes symmetric difference), and this latter set has size no more than $200d\varepsilon|B|$ by
regularity.

(ii) follows from (i) and the triangle inequality. Indeed $|\mathbf{E}_{x \in B} f(x+y) - \mathbf{E}_{x \in B} f(x)| \le
200d\varepsilon$ for each $y \in A$, whence

$|\mathbf{E}_{x \in B}\mathbf{E}_{y \in A} f(x+y) - \mathbf{E}_{x \in B} f(x)| \le 200d\varepsilon$,

which is (ii).

(iii) is immediate from (ii) and the pigeonhole principle.

(iv) follows from the pigeonhole principle and the fact that

$|\mathbf{E}_{x \in B} f(x) - \mathbf{E}_{x \in B(S,(1-\varepsilon)\rho)}\mathbf{E}_{y \in x+A} f(y)| \le 500d\varepsilon$,
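which is in turn implied by (ii) and the bound

$|\mathbf{E}_{x \in B} F(x) - \mathbf{E}_{x \in B(S,(1-\varepsilon)\rho)} F(x)| \le 300\varepsilon d$,

valid for any $F : G \to \mathcal{D}$. To confirm this, note that

$|\sum_{x \in B} F(x) - \sum_{x \in B(S,(1-\varepsilon)\rho)} F(x)| \le |B \setminus B(S,(1-\varepsilon)\rho)| \le 100\varepsilon d|B|$.

Therefore

$|\mathbf{E}_{x \in B} F(x) - \mathbf{E}_{x \in B(S,(1-\varepsilon)\rho)} F(x)|$
$\le \frac{1}{|B|}\,|\sum_{x \in B} F(x) - \sum_{x \in B(S,(1-\varepsilon)\rho)} F(x)| + |\frac{1}{|B|} - \frac{1}{|B(S,(1-\varepsilon)\rho)|}|\,|\sum_{x \in B(S,(1-\varepsilon)\rho)} F(x)|$
$\le 100\varepsilon d + \frac{|B|}{|B(S,(1-\varepsilon)\rho)|} - 1$
$\le 100\varepsilon d + \frac{1}{1 - 100\varepsilon d} - 1 \le 300\varepsilon d$.

This completes the proof of Lemma 4.2. □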

We adopt the following useful notation, analogous to the $p$ notation used in proving
Lemma 3.1. When $g(x_1,\ldots,x_n)$ is a complex-valued function of certain variables
$x_1,\ldots,x_n$ with $\|g\|_\infty \le 1$, we shall refer to $g(x_1,\ldots,x_n)$ instead as $b(x_1,\ldots,x_n)$. The
notation $b$ thus denotes a function with $\|b\|_\infty \le 1$, but the notation may refer to
different functions from line to line, or even on the same line (similar to the $O$ notation,
or the use of the unspecified constants $C$).

We now record a basic application of the Cauchy-Schwarz inequality, whose proof is
immediate.

Lemma 4.3 (Cauchy-Schwarz). Let $X,Y$ be finite sets, and let $f : X \times Y \to \mathbf{C}$ be a
function. Then for any bounded function $b(x)$ of $x \in X$, we have

$|\mathbf{E}_{x,y} f(x,y)b(x)| \le \mathbf{E}_x|\mathbf{E}_y f(x,y)| \le (\mathbf{E}_x|\mathbf{E}_y f(x,y)|^2)^{1/2} = (\mathbf{E}_{x,y,y'} f(x,y')\overline{f(x,y)})^{1/2}$.

In the special case when $Y = G$ is a group, we conclude in particular the van der
Corput inequality

$|\mathbf{E}_{x \in X, y \in G} f(x,y)b(x)| \le |\mathbf{E}_{x \in X; y,h \in G} T_y^h f(x,y)\overline{f(x,y)}|^{1/2}$ (4.1)

whenever $f : X \times G \to \mathbf{C}$ is a function, which follows by using the substitution $y' = y + h$;
here $T_y^h f(x,y) := f(x,y+h)$ denotes the shift in the $y$ variable.
This inequality is very useful for eliminating unknown bounded functions $b(x)$ in an
expression to be estimated. Using the van der Corput inequality, we can now prove
Proposition 1.7.
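
Inequality (4.1) can be sanity-checked numerically. The sketch below evaluates both sides for a function on $X \times \mathbf{Z}/N\mathbf{Z}$ with $X = \{0,\ldots,n_X - 1\}$; the storage of $f$ as a dictionary keyed by $(x,y)$, the function name and the random test data are all assumptions of this illustration.

```python
import cmath
import random


def van_der_corput_sides(f, b, nX, N):
    """Return (LHS, RHS) of (4.1) for f : X x Z/NZ -> C and bounded b : X -> C.

    f is a dict keyed by (x, y); b is a list indexed by x in range(nX).
    """
    lhs = abs(sum(f[x, y] * b[x] for x in range(nX) for y in range(N))) / (nX * N)
    # E_{x, y, h} f(x, y + h) * conj(f(x, y)), the shifted autocorrelation in y
    corr = sum(
        f[x, (y + h) % N] * f[x, y].conjugate()
        for x in range(nX) for y in range(N) for h in range(N)
    ) / (nX * N * N)
    return lhs, abs(corr) ** 0.5


if __name__ == "__main__":
    random.seed(0)
    nX, N = 4, 5
    f = {(x, y): complex(random.uniform(-1, 1), random.uniform(-1, 1))
         for x in range(nX) for y in range(N)}
    b = [cmath.exp(2j * cmath.pi * random.random()) for _ in range(nX)]
    print(van_der_corput_sides(f, b, nX, N))
```

The right-hand side no longer involves $b$, which is the point of the inequality.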
Proof of Proposition 1.7. It suffices to prove the more general statement (see the footnote below)

(4-2) 

i6J 

for all finite sets $J$ with $|J| \ge 1$, all $j_0 \in J$, all bounded functions $(f_j)_{j \in J}$, and all
distinct integers $(a_j)_{j \in J}$ such that $a_j - a_{j'}$ is coprime to $|G|$ for all distinct $j,j' \in J$.


We induct on \J\. When |J| = 1 the claim is trivial from P-ljl . so suppose |J| ^2 
and the claim has already been proven for smaller values of J. Let ji be an element in 
J\{jo}- By making the change of variables x ^ x + aj^h if necessary we may assume 
that = 0. Since fj-^ is bounded, we can then express the left-hand side of (Q as 


|E,A6G(b(i) n r“''*/j(n)|. 

which by dHH) can be bounded by 

EteG(E.,k6G( n 

jeAlii} 


Applying the inductive hypothesis dOl) to the inner expectation, we can bound this in 
turn by 

EKG(l|r“'yj7'llGl^l-(G))‘''". 

which by Holder’s inequality and the substitution h := Ojk (noting that (iV, Oj — = 

(A^, ttj) = 1) is bounded by 


E 


h&G 




2 NI -2 |n1/2I-^I-1 


The claim then follows from (ED). 


□ 


Let us now consider averages of the form

$\mathbf{E}_{z \in B', x \in B}(f(z)g(x)h(z + x))$

where $B,B'$ are non-empty subsets of an additive group $G$, and $f,g,h$ are bounded
functions. If $B = B' = G$, then a variant of Proposition 1.7 shows that this quantity is
bounded by $\|f\|_{U^2(G)}$, and hence (by Proposition 2.2) if the above average is large, then
$f$ must correlate with a linear phase function. It turns out that a similar statement is
true for arbitrary $B, B'$, provided that $B + B'$ is only a little bit larger than $B$.

Lemma 4.4 (Large trilinear form implies correlation with linear phase). Let $B, B'$ be
two non-empty subsets of an additive group $G$. Then we have

$\|f\|_{u^2(B')} \ge (\frac{|B|}{|B+B'|})^{1/2}\,|\mathbf{E}_{z \in B', x \in B}(f(z)b_1(x)b_2(z + x))|$

for any $f : G \to \mathbf{C}$ and any two bounded functions $b_1, b_2$.

^Indeed, the U‘^{G) norms are capable of controlling even more general expressions, for instance when 
the linear shifts ajh are replaced by polynomial shifts with no constant coefficient; this is implicitly in 
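Proof. Without loss of generality we may assume that $f, b_1, b_2$ vanish outside of $B'$,
$B$, $B + B'$ respectively. From Fourier expansion we have

$\mathbf{E}_{z \in B', x \in B}(f(z)b_1(x)b_2(z + x)) = \frac{1}{\mathbf{E}(1_{B'})\mathbf{E}(1_B)}\mathbf{E}_{y \in G}(f * b_1(y)b_2(y))$
$= \frac{1}{\mathbf{E}(1_{B'})\mathbf{E}(1_B)}\sum_{\xi \in \widehat{G}} \widehat{f}(\xi)\widehat{b_1}(\xi)\widehat{b_2}(-\xi)$.

On the other hand, from Plancherel we have $\sum_\xi |\widehat{b_1}(\xi)|^2 \le \mathbf{E}(1_B)$ and $\sum_\xi |\widehat{b_2}(-\xi)|^2 \le
\mathbf{E}(1_{B+B'})$. From Hölder's inequality we thus conclude that

$|\mathbf{E}_{z \in B', x \in B}(f(z)b_1(x)b_2(z + x))| \le \frac{(\mathbf{E}(1_B)\mathbf{E}(1_{B+B'}))^{1/2}}{\mathbf{E}(1_{B'})\mathbf{E}(1_B)}\sup_{\xi \in \widehat{G}}|\widehat{f}(\xi)|$

and hence there exists $\xi \in \widehat{G}$ such that

$|\mathbf{E}_{z \in B'}(f(z)e(-\xi \cdot z))| \ge |\mathbf{E}_{z \in B', x \in B}(f(z)b_1(x)b_2(z + x))|\,(\frac{|B|}{|B+B'|})^{1/2}$.

The claim follows. □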


5. An argument of Gowers

We now begin the proofs of the inverse theorems in Theorem 12.81 and Theorem o 
As we shall see, the arguments shall be analogous to those used to prove Lemma 3.1
and closely follow the treatment in Gowers [25]. The first part of the argument is to
establish a “phase derivative” $h \mapsto \xi_h$ for the function $f$ and to establish some additivity
properties on this phase derivative. These arguments apply to arbitrary finite additive
groups $G$; later on we shall treat the finite field case separately.

Proposition 5.1 (Large $U^3$ norm gives many additive quadruples [25]). Let $G$ be an
arbitrary finite additive group, and let $f : G \to \mathcal{D}$ be a bounded function such that
$\|f\|_{U^3(G)} \ge \eta$ for some $\eta > 0$. Then there exists a set $H \subseteq G$, and a function $\xi : H \to \widehat{G}$
whose graph $\Gamma := \{(h,\xi_h) : h \in H\} \subseteq G \times \widehat{G}$ obeys the estimate

$|\{(z_1,z_2,z_3,z_4) \in \Gamma^4 : z_1 + z_2 = z_3 + z_4\}| \ge \eta^{64}N^3/2^8$. (5.1)

Furthermore for each $(h,\xi_h) \in \Gamma$ we have

$|\mathbf{E}_{x \in G} T^h f(x)\overline{f(x)}e(-\xi_h \cdot x)| \ge \eta^4/2^{1/2}$.

Proof. As we shall see, this proposition corresponds fairly closely with the hrst part of 
the proof of Lemma 18.11 (up to m)- From (HH) we have 

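$\mathbf{E}_{h \in G}\|T^h f\overline{f}\|_{U^2(G)}^4 \ge \eta^8$.

Applying Proposition 2.2 we conclude that

$\mathbf{E}_{h \in G}\|T^h f\overline{f}\|_{u^2(G)}^2 \ge \eta^8$.

Thus if we let

$H := \{h \in G : \|T^h f\overline{f}\|_{u^2(G)}^2 \ge \eta^8/2\}$

then we have

$\mathbf{E}_{h \in G}\|T^h f\overline{f}\|_{u^2(G)}^2 1_{G \setminus H}(h) \le \eta^8/2$

and hence

$\mathbf{E}_{h \in G}\|T^h f\overline{f}\|_{u^2(G)}^2 1_H(h) \ge \eta^8/2$.

In particular we have

$\mathbf{E}(1_H) \ge \eta^8/2$. (5.2)

By (2.5) and the definition of $H$, we can find a map $h \mapsto \xi_h$ from $H$ to $\widehat{G}$ such that

$|\mathbf{E}_{x \in G} T^h f(x)\overline{f(x)}e(-\xi_h \cdot x)| \ge \eta^4/2^{1/2}$ for all $h \in H$. (5.3)

Let us fix this map $h \mapsto \xi_h$. We square sum the above expression in $h$ and use (5.2) to
conclude

$\mathbf{E}_{h \in G}|\mathbf{E}_{x \in G} T^h f(x)\overline{f(x)}e(-\xi_h \cdot x)|^2 1_H(h) \ge \eta^{16}/4$.

But from the identity

$|\mathbf{E}_x T^h f(x)\overline{f(x)}e(-\xi_h \cdot x)|^2 = \mathbf{E}_{x,k}\overline{T^k(T^h f)(x)}\,T^k f(x)\,T^h f(x)\,\overline{f(x)}\,e(\xi_h \cdot k)$

we conclude

$|\mathbf{E}_{x,h,k}\overline{T^k(T^h f)(x)}\,T^h f(x)\,e(\xi_h \cdot k)1_H(h)\,T^k f(x)\overline{f(x)}| \ge \eta^{16}/4$. (5.4)

At this point we suppress the explicit mention of the functions $f$ and write this simply
as

$|\mathbf{E}_{x,h,k \in G}(b(x + h, k)1_H(h)e(\xi_h \cdot k)b(x,k))| \ge \eta^{16}/4$.

Applying (4.1) to eliminate the $b(x,k)$ factor, we conclude

$\mathbf{E}_{h,h_1,x,k}\,b(x + h + h_1, k)b(x + h, k)e((h_1 \cdot \nabla_h)\xi_h \cdot k)1_H(h + h_1)1_H(h) \ge \eta^{32}/16$. (5.5)

Making the substitution $y = x + h$, we obtain

$\mathbf{E}_{h,y,h_1,k}\,e((h_1 \cdot \nabla_h)\xi_h \cdot k)1_H(h + h_1)1_H(h)b(y,h_1,k) \ge \eta^{32}/16$. (5.6)

Applying (4.1) again we conclude

$\mathbf{E}_{h,h_1,h_2,y,k}\,e((h_2 \cdot \nabla_h)(h_1 \cdot \nabla_h)\xi_h \cdot k)1_H(h + h_1 + h_2)1_H(h + h_2)1_H(h + h_1)1_H(h) \ge \eta^{64}/256$. (5.7)

Summing this in $k$ using the Fourier inversion formula, and discarding the irrelevant $y$
averaging, we infer

$\mathbf{E}_{h,h_1,h_2}\,1_{(h_2 \cdot \nabla_h)(h_1 \cdot \nabla_h)\xi_h = 0}\,1_H(h + h_1 + h_2)1_H(h + h_2)1_H(h + h_1)1_H(h) \ge \eta^{64}/256$. (5.8)

The claim now follows by substituting $z_1 := (h,\xi_h)$, $z_2 := (h + h_1,\xi_{h+h_1})$, $z_3 := (h +
h_2,\xi_{h+h_2})$, and $z_4 := (h + h_1 + h_2,\xi_{h+h_1+h_2})$, so that $z_1 + z_4 = z_2 + z_3$, which is an additive quadruple after relabelling. □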
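
The quantity bounded in (5.1) is straightforward to compute for small graphs. The sketch below counts additive quadruples of a finite subset of $(\mathbf{Z}/N\mathbf{Z})^2$, which here stands in for $G \times \widehat{G}$ with $G = \mathbf{Z}/N\mathbf{Z}$ (an identification assumed for this illustration); the function name is ours.

```python
from collections import Counter


def additive_quadruples(points, N):
    """Count (z1, z2, z3, z4) in points^4 with z1 + z2 = z3 + z4 in (Z/NZ)^2."""
    # r(s) = number of ordered pairs (a, b) with a + b = s; the count is sum of r(s)^2.
    pair_sums = Counter(
        ((a[0] + b[0]) % N, (a[1] + b[1]) % N)
        for a in points for b in points
    )
    return sum(r * r for r in pair_sums.values())


if __name__ == "__main__":
    N = 7
    linear = [(h, (3 * h) % N) for h in range(N)]       # graph of a homomorphism
    parabola = [(h, (h * h) % N) for h in range(N)]     # graph with little additive structure
    print(additive_quadruples(linear, N), additive_quadruples(parabola, N))
```

The graph of a homomorphism attains the maximum $N^3$, while the parabola has only the trivial quadruples, which illustrates why many quadruples force approximate linearity of $h \mapsto \xi_h$.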

As remarked, the analogy between this argument and the first part of the proof of
Lemma o is very close. In particnlar eqnations (lEl, (O, (lESI), (jSSI), (EH) and (Ol) 
are analognes of (jSII), (HI, (HSl), (D, (HED and (USD respectively. 

In order to exploit the conclusion (5.1), we require two very useful results from additive
combinatorics. The first result asserts that a set $\Gamma$ with some partial additive structure
(in the sense that it contains many additive quadruples) can be refined to have a more
complete additive structure (in the sense that its sum set is small). This type of result
was hrst obtained by Balog and Szemeredi P] , but with very weak constants. A version 
of the theorem with polynomial dependencies between the constants was obtained by 
Gowers j2Sj. The version we quote below, with rather good powers in those polynomials, 
may be proved by a careful working of the argument in Chang ^2]. Of course for the 
purposes of this paper the precise values of the constants are somewhat unimportant. 
Write $\Gamma' - \Gamma' := \{z - z' : z,z' \in \Gamma'\}$ for the difference set of $\Gamma'$.

Theorem 5.2 (Balog-SzemerMi-Gowers theorem fBl)- LetG he an additive group, and 
let $\Gamma$ be a finite non-empty subset of $G$ such that

$|\{(z_1,z_2,z_3,z_4) \in \Gamma^4 : z_1 + z_2 = z_3 + z_4\}| \ge |\Gamma|^3/K$

for some $K \ge 1$. Then there exists a subset $\Gamma' \subseteq \Gamma$ such that

|F'| ^ 2-^K-^\T\ and |F' - F'| ^ 2'^^K^\T'\. 

The other tool we need is the Pliinnecke ineqnality, a simple proof® of which can be 
fonnd in [HK] . 

Theorem 5.3 (Pliinnecke inequalities |5()1155j i. Let G be an arbitrary additive group, 
and let $\Gamma'$ be a finite non-empty subset of $G$ such that $|\Gamma' - \Gamma'| \le K|\Gamma'|$ for some $K \ge 1$.
Then we have $|k\Gamma' - l\Gamma'| \le K^{k+l}|\Gamma'|$ for all $k,l \ge 1$, where $k\Gamma'$ denotes the $k$-fold sumset
of $\Gamma'$.

In our applications the integers $k, l$ will be quite small, in fact they will not exceed 9.
Combining Proposition 5.1 with Theorem 5.2 and Theorem 5.3 we conclude

Proposition 5.4 ($h \mapsto \xi_h$ is nearly affine-linear). Let $G$ be an arbitrary finite additive
group, and let $f : G \to \mathcal{D}$ be a bounded function such that $\|f\|_{U^3(G)} \ge \eta$ for some $\eta > 0$.
Then there exists a set $H' \subseteq G$ and a function $\xi : H' \to \widehat{G}$ whose graph $\Gamma' := \{(h,\xi_h) : h \in
H'\} \subseteq G \times \widehat{G}$ obeys the estimates

|F'| ^ 2-^V^iV (5.9) 

and 

\kT' - /F'l ^ for all kfi^l. 

Furthermore for each {h, fh) G F' we have 

\E,,^GiT^fix)mei-fh • t))| ^ r7V2. (5.10) 

Thus far we have not used anything about the underlying group $G$ other than it is
finite and abelian. We could continue doing this, proving Theorem 2.7 and extracting
Theorem 2.3 as a corollary. However for expository reasons we will now restrict to
the finite field geometry case $G = \mathbf{F}_5^n$ and give a complete proof of Theorem 2.3. The
arguments there serve as a simplified model which will help motivate the general case.

6. The finite field case

We now restrict attention to a finite field geometry setting $G = \mathbf{F}_5^n$, and prove Theorem
2.3 (i). In this section, then, $N = 5^n$. Thanks to Proposition 5.4 we have already
isolated a phase derivative $h \mapsto \xi_h$ which exhibits some linear behavior. The first step
(following Gowers [25], [27]) is to show that $\xi$ in fact matches up with a linear phase
function on a large subspace; then we shall show that in fact $\xi$ matches up with a
self-adjoint linear phase function on a large subspace (this is the substantially new part

^Actually, since we only need this theorem for bounded k, I, and would not be concerned if the bound 
of were worsened to for some absolute constant C, it is possible to modify the proof of 

Theorem 15. 2l in order to gain control on |fcr' — ff'| directly, without needing the Pliinnecke inequalities. 
However this would not simplify the remainder of the argument and so we do not give the details of 
this alternate approach here. 
of the argument). Finally, we shall again follow Gowers [25] and conjugate $f$ by a
quadratic phase to eliminate this linear phase derivative and conclude the argument.

Step 1: Linearization of phase derivative. In this subsection we establish 

Proposition 6.1 (Graphs have large linear component [25], [27]). Let $H' \subseteq \mathbf{F}_5^n$, and let
$\xi : H' \to (\mathbf{F}_5^n)^*$ be a function whose graph $\Gamma' := \{(h,\xi_h) : h \in H'\} \subseteq \mathbf{F}_5^n \times (\mathbf{F}_5^n)^*$ obeys
the estimates

$K^{-1}N \le |\Gamma'| \le |9\Gamma' - 8\Gamma'| \le KN$ (6.1)

for some $K \ge 1$. Then there exists a linear subspace $V \subseteq \mathbf{F}_5^n$ with the codimension
bound

$n - \dim(V) \le (2K)^6$

and a translate $x_0 + V$ of this subspace, together with a linear transformation $M : V \to
(\mathbf{F}_5^n)^*$ and an element $\xi_0 \in (\mathbf{F}_5^n)^*$ such that

$\mathbf{E}_{h \in V}(1_{H'}(x_0 + h)1_{\xi_{x_0+h} = 2Mh + \xi_0}) \ge (2K)^{-4}$. (6.2)

Proof. Note that while $\Gamma'$ is a graph, the slightly larger sets $k\Gamma' - l\Gamma'$ need not be
graphs. The next lemma shows that this can be rectified by passing to an appropriate
subset $\Gamma'' \subseteq \Gamma'$.

Lemma 6.2. There exists a subset $\Gamma''$ of $\Gamma'$ with

$|\Gamma''| \ge N/5K^3$ (6.3)

such that $4\Gamma'' - 4\Gamma''$ is a graph.

Remarks. When G = Z/A^Z is a cyclic group of prime order, this lemma is essentially 
m Lemma 7.5], and our arguments both here and in l|niare in a similar spirit. Guriously, 
the argument of uses the fact that 'Ll NT, is a held, and so this is a rare instance of 
the cyclic group case being somewhat easier than the general group case. The conclusion 
can also be rephrased as an assertion that the map h 1 —is a Freiman 8-homomorphism 
on H". 

Proof. Let $A \subseteq (\mathbf{F}_5^n)^*$ be the set of all $\xi$ such that $(0,\xi) \in 8\Gamma' - 8\Gamma'$. Observe that
since $\Gamma'$ is a graph, we have $|\Gamma' + A| = |\Gamma'||A|$. On the other hand, $\Gamma' + A$ is contained in
$9\Gamma' - 8\Gamma'$. Applying (6.1), we conclude that $|A| \le K^2$. Now let $m = \lceil \log_5 |A| \rceil$, and let
$\Psi : (\mathbf{F}_5^n)^* \to \mathbf{F}_5^m$ be a randomly chosen linear transformation from $(\mathbf{F}_5^n)^*$ to $\mathbf{F}_5^m$. Observe
that for each non-zero $\xi \in A$, we have $\Psi(\xi) \ne 0$ with probability at least $1 - 5^{-m}$. Thus
we see that with non-zero probability $\Psi$ is non-zero on all of $A \setminus \{0\}$.

Fix $\Psi$ with the above properties, and let $c$ be a randomly selected point in $\mathbf{F}_5^m$, and
define $\Gamma'' := \{(h,\xi_h) \in \Gamma' : \Psi(\xi_h) = c\}$. Then the expected size of $|\Gamma''|$ is at least
$|\Gamma'|/5^m \ge N/5K^3$. Also, observe that if $(0,\xi)$ lies in $8\Gamma'' - 8\Gamma''$ then $\xi$ lies in $A$ and (by
linearity of $\Psi$) $\Psi(\xi) = 0$, hence $\xi = 0$. Since $8\Gamma'' - 8\Gamma''$ is the difference set of $4\Gamma'' - 4\Gamma''$,
this implies that $4\Gamma'' - 4\Gamma''$ is a graph. The claim follows. □

Define $H''$ by $\Gamma'' := \{(h,\xi_h) : h \in H''\}$. Now we use a result of Bogolyubov [7] (which
we shall also utilize later in this paper) to obtain some control on the set $2H'' - 2H''$.

Lemma 6.3 (Bogolyubov lemma [7]). Let $A$ be a subset of a finite additive group
$G$ such that $|A| \ge \delta|G|$. Then there exists a set $S \subseteq \widehat{G}$ with $|S| \le 2\delta^{-2}$ such that
$B(S,\frac{1}{4}) \subseteq 2A - 2A$.
Remark. Somewhat sharper versions of this lemma are known, see [12], but we will not
need these here.

Proof. From Fonrier inversion we have 

«6G 

Convolving this with itself fonr times, we obtain 

1a * 1a * 1-A * l-A(a^) = ^ |1 a(0I'^6('C ■ 

«6G 

where / * g{x) := E,yf{y)g{x — y) is the normalized convolntion operation. Since the 
left-hand side is only non-zero on 2A — 2A, we conclnde that 

{a: e G : 3? ^ |Ta( 0 l^e(^ • x) > 0} C 2kl - 2A. 

«6G 

Let ns now stndy the snm on the left-hand side. Let 0 < a < 5 be a parameter to be 
chosen later, and let S := {^ G G : |1a(OI ^ denote the large Fonrier coefficients of 
A. Since a < E(l^), we have 0 G S'. Also, from the Plancherel identity 

C6G 

and Chebyshev’s ineqnality we have an npper bonnd on the cardinality of S: 

|S| ^ 6/a^. 

Now snppose that x G i?(S, 1). Then we have cos(27r^ • x) ^ 0 for all ^ G S. Tims 
^ ■ ^) = cos( 27 r^ ■ ^) + I 1a(0 1^ cos( 27 ri^ ■ x) 

5eG ies 

?6G 

by another application of Plancherel’s theorem. Tims if we set a := (5^/^/a/ 2 (for 
instance), the claim follows. □ 
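
The construction in this proof can be tested numerically in $G = \mathbf{Z}/N\mathbf{Z}$. The sketch below takes $\delta := |A|/N$ exactly, forms $S$ as the set of Fourier coefficients of size at least $\alpha = \delta^{3/2}/\sqrt{2}$, and compares $B(S,\frac{1}{4})$ with $2A - 2A$. The Fourier normalisation $\widehat{1_A}(\xi) = \mathbf{E}_x 1_A(x)e(-\xi x/N)$, the small safety margin on the threshold, and the function name are assumptions of this illustration.

```python
import cmath
import math


def bogolyubov_check(N, A):
    """Return (S, B(S, 1/4), 2A - 2A) for a non-empty list A of residues mod N."""
    delta = len(A) / N
    alpha = delta ** 1.5 / math.sqrt(2)

    def fourier(xi):
        # Normalised transform E_x 1_A(x) e(-xi * x / N)
        return sum(cmath.exp(-2j * math.pi * xi * a / N) for a in A) / N

    # Threshold lowered slightly so rounding can only enlarge S,
    # which can only shrink the Bohr set and keeps the inclusion valid.
    S = [xi for xi in range(N) if abs(fourier(xi)) >= alpha * (1 - 1e-9)]

    def in_bohr(x):
        # ||xi * x / N|| < 1/4, tested exactly with integers
        return all(4 * min((xi * x) % N, (-xi * x) % N) < N for xi in S)

    bohr = {x for x in range(N) if in_bohr(x)}
    sums = {(a + b) % N for a in A for b in A}
    two_a_minus_two_a = {(s - t) % N for s in sums for t in sums}
    return S, bohr, two_a_minus_two_a


if __name__ == "__main__":
    A = sorted({(x * x) % 101 for x in range(1, 21)})   # 20 residues mod 101
    S, bohr, target = bogolyubov_check(101, A)
    print(len(S), len(bohr), bohr <= target)
```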

We apply Lemma 6.3 with $G = \mathbf{F}_5^n$ and $A = H''$, the set coming from Lemma 6.2. The
set $S$ we produce satisfies $|S| \le 50K^6$. Let $V \subseteq \mathbf{F}_5^n$ be the subspace $V := \{x \in \mathbf{F}_5^n :
x \cdot \xi = 0 \text{ for all } \xi \in S\}$. Then by linear algebra we have

$\dim(V) \ge n - |S| \ge n - 50K^6$

and, since clearly $V \subseteq B(S,\frac{1}{4})$, we have

$V \subseteq 2H'' - 2H''$.

Thus there exists a map $M : V \to \widehat{G}$ such that

$Z := \{(h,2Mh) : h \in V\}$ (6.4)

is a subset of $2\Gamma'' - 2\Gamma''$; since $2\Gamma'' - 2\Gamma''$ is a graph and contains the origin, we have
$M0 = 0$. Also, since $4\Gamma'' - 4\Gamma''$ is a graph and $V$ is closed under addition, we see that
$M(h + h') = Mh + Mh'$ for all $h, h' \in V$, thus $M$ is linear.

Consider the set $Z + \Gamma''$. On one hand, this set can be foliated into (say) $L$ disjoint
cosets of the linear space $Z$. On the other hand, it is contained in $3\Gamma'' - 2\Gamma''$, and

$|3\Gamma'' - 2\Gamma''| \le |9\Gamma' - 8\Gamma'| \le KN$.

This gives the bound $|Z| \le KN/L$. Since $Z + \Gamma''$ also contains $\Gamma''$, we thus see from the
pigeonhole principle and (6.3) that there exists a coset $(x_0,\xi_0) + Z$ of $Z$ such that

$|\Gamma'' \cap ((x_0,\xi_0) + Z)| \ge |Z|/5K^4$.

Since $\dim(Z) = \dim(V) \ge n - 50K^6$, the claim (6.2) follows. □

Combining Proposition 6.1 with Proposition 5.4 we quickly conclude the following
proposition.

Proposition 6.4 (Large $U^3(\mathbf{F}_5^n)$ norm gives linear phase derivative). Let $G = \mathbf{F}_5^n$, and let
$f : G \to \mathcal{D}$ be a bounded function such that $\|f\|_{U^3(G)} \ge \eta$ for some $\eta > 0$. Then there
exists a linear subspace $V$ of $G$ with the codimension bound

n — dim(l/) ^ (6.5) 

and a translate Xq + V of this subspace, together with a linear transformation M : V ^ G 
and an element E G such that 

E,ey|E,6GT"°+V(a:)7Me(-(2Mh + ^o) • x)| ^ 2-^^p^'^. (6.6) 

It is permissible to take Gi, G[ = 2^® for i = 1, 2. 

Step 2: The symmetry argument. We now establish some symmetry properties on M, 
closely following the argument in Lemma fd. 11 As we shall be focusing more on M than 
on / in this step, we shall suppress the terms involving / (and fo and xq) using the b 


notation. Indeed from ()6.6jl we have 

|Ea:eG,/ievb(a; + h)h{x)h{h)e{-2Mh ■ x)\ 2~^'^rf'‘^ (6.7) 

Once again we use Cauchy-Schwarz and similar tools to eliminate all the bounded 
functions. By Lemma there exists xi E G such that 

|IEa;,hevb(a; + Xi + h)h{x + a;i)b(h)e(—2Mh ■ {x + a;i))| ^ 2~^^r]^^, (6.8) 

which after redehning the bounded functions to absorb the Xi terms implies that 

|IEa:,hevb(x + h)h{x)h{h)e{—2Mh ■ x)| ^ 2~’^'^r]^K (6.9) 

Applying Cauchy-Schwarz (Lemma EHI) to eliminate h{h), we see that 

\^x,y,h£vHy + h)h{y + h)h{x + h)h{x)e{-2Mh ■ {y - x))\ ^ (6.10) 

Making the substitution z = x + y + h, this becomes 

\E,^,y,z&vH^,y)H^,x)e{-2M{z - X - y) ■ {y - x))\ ^ 

Absorbing as many phase terms into the functions b as we can, we infer 

\K,y,zevH^-,y)bO-,x)e(2{x,y})\ J 


( 6 . 12 ) 
where $\{x,y\}$ is the anti-symmetric form defined in (3.13), that is to say $\{x,y\} = Mx \cdot
y - My \cdot x$. By the pigeonhole principle in $z$ we conclude that

|Ex,j/evb(?/)b(x)e(2{x,?/})| ^ 

for some bonnded fnnctions h{y), b(x). Applying Canchy-Schwarz again to eliminate 
the b(a:) factor, we dednce that 

\E^,y,y'evHy)Hy)e{2{x,y -y})\^ 

By the triangle ineqnality we obtain 

Ey^y>(.v\^xevei2{x,y' -y})\ ^ (6.15) 

Making the snbstitntion h = y' — y we conclnde 

Eh^v\^xeve{2{x,h})\ ^ ('g^^g^ 


The map $x \mapsto 2\{x,h\}$ is a homomorphism from $V$ to $\mathbb{R}/\mathbb{Z}$. Thus if we write

\[ W := \{h \in V : \{x,h\} = 0 \text{ for all } x \in V\} \]

then $W$ is a linear subspace of $V$ and

\[ \mathbb{E}_{x \in V} e(2\{x,h\}) = 1_W(h). \]
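The orthogonality relation above can be checked by brute force in a small case. The following is a minimal sketch, assuming $V = \mathbb{F}_5^3$, a randomly chosen (not necessarily symmetric) matrix $M$, and the convention $e(t) = \exp(2\pi i t)$ with $x \cdot y$ taking values in $\frac{1}{5}\mathbb{Z}/\mathbb{Z}$; the function names are illustrative and do not come from the paper.

```python
import cmath
import itertools
import random

P = 5  # we work over the field F_5

def e(numerator):
    """e(t) = exp(2*pi*i*t) for t = numerator/P in (1/P)Z/Z."""
    return cmath.exp(2j * cmath.pi * (numerator % P) / P)

def mat_vec(M, x):
    """Matrix-vector product over F_5."""
    return [sum(m * xi for m, xi in zip(row, x)) % P for row in M]

def dot(x, y):
    """Dot product over F_5."""
    return sum(a * b for a, b in zip(x, y)) % P

def bracket(M, x, y):
    """Anti-symmetric form {x, y} = Mx.y - My.x over F_5."""
    return (dot(mat_vec(M, x), y) - dot(mat_vec(M, y), x)) % P

def check_orthogonality(n=3, seed=0):
    """Verify E_x e(2{x,h}) = 1_W(h) for every h; return (|W|, |V|)."""
    rng = random.Random(seed)
    M = [[rng.randrange(P) for _ in range(n)] for _ in range(n)]
    V = list(itertools.product(range(P), repeat=n))
    w_size = 0
    for h in V:
        average = sum(e(2 * bracket(M, x, h)) for x in V) / len(V)
        in_w = all(bracket(M, x, h) == 0 for x in V)
        w_size += in_w
        assert abs(average - (1 if in_w else 0)) < 1e-9
    return w_size, len(V)
```

The subspace $W$ found this way is the kernel of $M^T - M$, which is why a large $W$ forces $M$ to be self-adjoint on it.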

Thus, in view of (6.16), we see that $W$ is extremely large relative to $V$:

\[ |W|/|V| = \mathbb{E}_{h \in V} 1_W(h) \geq 2^{-4C_2} \eta^{4C_2'}. \]

In particular, from (6.5) we have

n — dim(lT) ^ 2^^ri~^'^ + AC 2 \og^{2/ri) + AC 2 logg 2 ^ 

By construction of $W$ we see that $M$ is self-adjoint on $W$, that is to say

\[ Mw \cdot w' = Mw' \cdot w \text{ for all } w, w' \in W. \tag{6.17} \]

Let us remark that the analogy between this argument and that of §3 is exceptionally
close. The seven equations (3.9), (3.10), (3.11), (3.12), (3.14), (3.15) and (3.16) are
analogues of (6.9), (6.10), (6.11), (6.12), (6.13), (6.14) and (6.15) respectively.

Step 3: Eliminating the quadratic phase component. We now give the final part of the
proof of Theorem 2.3 which also follows the proof of Lemma 3.1 closely. We return to
(6.6), which we write as

\[ |\mathbb{E}_{x \in G; h \in V} b(h) b(x+h) f(x) e(-2Mh \cdot x)| \geq 2^{-C_2} \eta^{C_2'}, \]

where we have distributed the phase factor $e(-\xi_0 \cdot x) = e(-\xi_0 \cdot (x+h)) e(\xi_0 \cdot h)$ among the
functions $b$. Here the focus will be on the $f(x)$ factor, the aim being to demonstrate that
this function exhibits some quadratic bias. We first observe that the simple averaging
argument of Lemma 4.1 allows us to find $h' \in V$ such that

\[ |\mathbb{E}_{x \in G; h \in W} b(h + h') b(x + h + h') f(x) e(-2M(h + h') \cdot x)| \geq 2^{-C_2} \eta^{C_2'}. \]

Once again we can absorb $e(-2Mh' \cdot x)$ into the functions $b$, and conclude that

\[ |\mathbb{E}_{x \in G; h \in W} b(h) b(x+h) f(x) e(-2Mh \cdot x)| \geq 2^{-C_2} \eta^{C_2'}, \]

which implies that

\[ |\mathbb{E}_{y \in G; x,h \in W} b(h) b(x + h + y) f(x + y) e(-2Mh \cdot (x + y))| \geq 2^{-C_2} \eta^{C_2'}. \]

Using the triangle inequality we deduce

\[ \mathbb{E}_{y \in G} |\mathbb{E}_{x,h \in W} b(h,y) b(x+h,y) f(x+y) e(-2Mh \cdot x)| \geq 2^{-C_2} \eta^{C_2'}. \tag{6.18} \]

Now we observe from (6.17) that we have the identity

\[ 2Mh \cdot x = M(x+h) \cdot (x+h) - Mx \cdot x - Mh \cdot h, \]

and hence

\[ e(-2Mh \cdot x) = b(x+h) b(h) e(Mx \cdot x). \]

Therefore (6.18) implies that

\[ \mathbb{E}_{y \in G} |\mathbb{E}_{x,h \in W} b(h,y) b(x+h,y) f(x+y) e(Mx \cdot x)| \geq 2^{-C_2} \eta^{C_2'}. \]

Applying Lemma 4.4 for each $y \in G$ separately and with $B = B' = W$, we conclude
that

^y&G\\f{x + y)e{Mx ■ x)\\u'i{w) > 

and hence of course 

Ey^cWfix + y)e{Mx ■ a:)||„3(vy) ^ 2~^‘^rj^\ 

But the u^iW) norm is invariant under quadratic phase modulations, conjugation, and 
translation, and so we have 


which gives (|21D. 


^^y&cWfWu^iy+W) ^ 2 


□ 


7. Application: Szemeredi’s theorem in finite field geometries 

As a sample application of Theorem 2.3 we can now prove a quantitative Szemerédi
theorem for progressions of length four in $\mathbb{F}_5^n$. If $G$ is any finite abelian group of order
$N$, where $(6,N) = 1$, we define $r_4(G)$ to be the cardinality of the largest set $A \subseteq G$
which does not contain four distinct elements in arithmetic progression.

Theorem 7.1 (Szemeredi theorem for r 4 (F 5 )). Write N = 5”. Then we have the bound 

r^{¥^) ^ N{\oglogN)-^~"\ 

Remark. In m below we will prove a similar result for an arbitrary G. Although that 
result will supersede the present one, the proof is quite a bit more complicated, and 
so we give the finite field argument separately now. In [33] we improve the bound to 
r 4 (F 5 ) -C 77(logiV)“'^ using substantially lengthier arguments. 

As with all analytic arguments for proving Szemeredi type theorems, the key step is the 
establishment of the following “density increment” result. 

Proposition 7.2. Let $\delta > 0$, suppose that

n > e(2/sf\ (7.1) 

and let $A$ be a set with size at least $\delta N$. Suppose that $A$ contains no four-term
arithmetic progressions. Then we can find an affine subspace $x_0 + V$ of $\mathbb{F}_5^n$ with dimension
$\dim(V) \geq n/3$ such that we have the density increment

(x) + {5/2f\ 
Proof. Write

\[ \alpha := \mathbb{E}_{x \in \mathbb{F}_5^n} 1_A(x) = |A|/N. \]

By Corollary 0 and the lower bound N ^ 2/5^ (which is very much a consequence of 
(EU !) we conclude that 

|| 1 a — tt||(73(Fg) ^ 5^/8. 

Applying Theorem 12.81 we can thus hnd a subspace hh ^ Fg with codimension at most 
{2/5Y^ and quadratic phase functions (py for each 1 / G Fg such that 

Ey^v^\E^^y+w{^A{x) - a)e{-(t)y{x))\ ^ {S/2f^. (7.2) 


On the other hand, we may assume that 


E^(.y+wi^Aix) - a) < \{5/2:f^ 


for all y G F5, since the proposition is immediate otherwise. Now since Ex^yj^wi)-A{,x) — 
a) has mean zero, we thus conclude that 

l®"a;6y+w(lA(3^) “«)| < 

Subtracting this from (in and using the pigeonhole principle, we infer that there exists 
1 / G Fg such that 

\E^^y+w{^A{x) - a)e{-(()y{x))\ > ^{6/2^^ + \E^^y+w{^A{x) - a)|. 

Now observe (using Lemma EIH) that $\phi_y$ takes values in the set $T = \{0, \frac{1}{5}, \frac{2}{5}, \frac{3}{5}, \frac{4}{5}\}$. For
each $t \in T$ let $S_t$ be the quadratic surface $S_t := \{x \in y + W : \phi_y(x) = t\}$. Then by the
triangle inequality we have


E^^y+w{^A{x) - a)e{-(j)y{x))\ ^ ^ \E^^y+w{^A{x) - a)l5 t(T)| 

teT 


whilst 


E:j;^y+w{^A{x) - a)\^ -E^^y+w{^A{x) -a) = - ^Ea;6y+VF(1A(T) - a)l5t(T). 

teT 


Combining these estimates, and using the pigeonhole principle, we deduce that there 
exists t eT such that 


|®^xS3/+It(1a( 2^) ®)lSt(3^)| ^ ^xSy+wlSt (^) Ex^y-[-w{XA{x^ Q!) 15 ((t), 

and hence 

Ex(.y+w{^A{x) - a)lst(2:) ^ \{5/2f^Ex(.y+w^St{x). (7.3) 

This gives a density increment on a quite large quadratic hypersurface StP {y + W). 
Our job is now to convert this into a density increment on a subspace. Observe from 
Lemma o that we can write StA {y + W) in the form 

StAiy + W) = y + {x eW ■. \x ■ Mx + v ■ x = c} 

for some self-adjoint linear transformation M : W ^ W, some v G W*, and some c E F. 
We now locate a large subspace on which M is degenerate. To this end we need a simple 
lemma®. For future reference we shall phrase this lemma for more general finite fields
than $\mathbb{F}_5$.

®One could also proceed here using the theory of Witt groups, but that would be far more advanced 
technology than what is actually needed here. 
Lemma 7.3 (Gauss sum lemma). Let $F$ be a finite field of odd characteristic, let $W$ be
a vector space over $F$, and let $M : W \to W$ be a self-adjoint linear transformation. If
$\dim(W) \geq 3$, then there exists a non-zero $x \in W$ such that $x \cdot Mx = 0$.

Proof. If $M$ has a non-trivial kernel then the claim is easy, so assume $M$ has no
kernel. Then a standard Gauss sum computation (using (4.1), for instance) shows that
$|\mathbb{E}_{x \in W} e(a(x \cdot Mx))|^2 = |F|^{-\dim(W)}$ for all $a \in F \setminus \{0\}$. But by Fourier inversion we
have

\[ \mathbb{P}_{x \in W}(x \cdot Mx = 0) = \frac{1}{|F|} \sum_{a \in F} \mathbb{E}_{x \in W} e(a(x \cdot Mx)) \geq \frac{1}{|F|} \left(1 - (|F| - 1) \cdot |F|^{-\dim(W)/2}\right). \]

Since $\dim(W) \geq 3$, the right-hand side exceeds $|F|^{-\dim(W)}$, which is the contribution of $x = 0$ alone. We thus see that $\{x \in W : x \cdot Mx = 0\}$ must contain at least one
non-zero element, and we are done. □
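The conclusion of Lemma 7.3 is easy to test exhaustively in small cases. The sketch below is an illustration only, assuming prime fields $\mathbb{F}_p$ with $p \in \{3, 5, 7\}$ and matrices stored as nested lists; the function names are not taken from the paper. It searches for a non-zero isotropic vector and also shows that the hypothesis $\dim(W) \geq 3$ cannot simply be dropped.

```python
import itertools
import random

def quad_form(M, x, p):
    """Value of x . Mx over F_p for a square matrix M given as nested lists."""
    n = len(M)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n)) % p

def isotropic_vector(M, p):
    """Return some non-zero x in F_p^n with x . Mx = 0, or None if none exists."""
    for x in itertools.product(range(p), repeat=len(M)):
        if any(x) and quad_form(M, x, p) == 0:
            return x
    return None

def random_symmetric(n, p, rng):
    """A random symmetric n x n matrix over F_p."""
    M = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            M[i][j] = M[j][i] = rng.randrange(p)
    return M
```

In dimension two the form $x_1^2 + 3x_2^2$ over $\mathbb{F}_5$ has no non-zero zeros, because $2$ is not a square modulo $5$; calling `isotropic_vector([[1, 0], [0, 3]], 5)` returns `None`.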


In our situation, the space IF has dimension substantially larger than 3 - in fact 
dim(lF) ^ n — (2/5)'^'". Let U he a. subspace of IF which is degenerate in the sense 
that X ■ My = 0 for all x,y E U, and which is maximal with respect to this property. 
We claim that dim(t/) ^ n/2 — (2/(5)^*^. Indeed if this were not the case the space 
$U^\perp := \{x \in W : x \cdot My = 0 \text{ for all } y \in U\}$ would have at least three more dimensions
than $U$ (in fact, vastly more than this), and one could apply the previous lemma to
$U^\perp/U$ to contradict the maximality of $U$.


Let U be as above. By splitting (HI into cosets of U and applying the pigeonhole 
principle, we may find a coset z + U oiU m. y + W such that 

Ea,g^+c/(l^(a;) -a)lsfix) ^ 

Note that on this space $z + U$, the form $x \cdot Mx$ becomes linear, which means that
$S_t \cap (z + U)$ is an affine subspace of $z + U$ of codimension at most 1. Letting $x_0 + V$
denote this subspace, and recalling that the constant C in Theorem 12.31 could certainly 
be taken to be 2 ^^, Proposition 17.21 follows. Note that (HU suffices to guarantee the 
stated bound $\dim(V) \geq n/3$. □


Proof of Theorem Suppose that A C has cardinality SN yet contains no four- 
term arithmetic progression. Then by repeated application of Proposition 17.21 we may 
construct a sequence 

Fg ^ Xl -t- Fl ^ X2 + V 2 ^ ... 
of affine subspaces such that dim(V^) ^ n/3^ and 

E^^,,^+vAa{x)^6 + j{6/2)‘^"\ (7.4) 

provided only that at all stages the condition HU is satisfied, that is to say 


dim{Vj) > 6{2/6)^"\ (7.5) 

Equation (HI is impossible if j ^ jo = (2/5)^^°, and so (j7.5j) must be violated by some 
j ^ jo- This means that 


n > 6 X (2/h)2'° X 3 ( 2 /^)' 
which certainly implies that 6 -C (loglogiV)”^ 


□ 
Remark. A more-or-less identical argument shows that

r^{¥;) « iV(loglogiV/p)-^ 

uniformly for all primes p ^ 5, for some absolute constant c > 0. In combination with 
Gowers’ result that r 4 (Z/A^Z) A^(loglog this may be used to give a fairly cheap 
proof that r 4 {G) = o{N) for all finite abelian G, a result which was first obtained by 
Frankl and Rodl na by rather different means. The key to the argument is that G 
contains either a large subgroup of the form F^, or else a large cyclic subgroup. The 
bound obtained is of the form 

r 4 (G) -C A^(loglogloglog 

we suppress the details since a far superior bound will be obtained in HTTl 

8. Some results on Bohr sets

Let $G$ be an arbitrary finite abelian group with $|G| = N$, let $S \subseteq \widehat{G}$ be a set
of $d$ characters, and suppose that $\rho \in (0,1)$ is a positive parameter. We will collect
some basic facts about Bohr sets $B(S,\rho)$ which we will need to prove Theorem 2.7.
These Bohr sets play the role that subspaces did in the finite geometry setting, with
the quantity $d$ corresponding, roughly speaking, to the codimension of the subspace.

Note that a Bohr set always contains 0, and is symmetric around the origin. In fact we
have the following easy bounds on the size of Bohr sets:

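Lemma 8.1 (Bounds for size of Bohr sets). We have

\[ |B(S,\rho)| \geq \rho^d N \]

and

\[ |B(S,2\rho)| \leq 5^d |B(S,\rho)|. \]

Proof. By the triangle inequality, we see that for any $y = (y_\xi)_{\xi \in S}$ in the torus $(\mathbb{R}/\mathbb{Z})^d$,
we have

\[ \sum_{x \in G} \prod_{\xi \in S} 1_{\|\xi \cdot x - y_\xi\|_{\mathbb{R}/\mathbb{Z}} < \rho/2} = |\{x \in G : \sup_{\xi \in S} \|\xi \cdot x - y_\xi\|_{\mathbb{R}/\mathbb{Z}} < \rho/2\}| \leq |\{x \in G : \sup_{\xi \in S} \|\xi \cdot x\|_{\mathbb{R}/\mathbb{Z}} < \rho\}| = |B(S,\rho)|. \]

Integrating this over all $y \in (\mathbb{R}/\mathbb{Z})^d$, we conclude that

\[ \sum_{x \in G} \rho^d \leq |B(S,\rho)|, \]

which gives the first bound.

To establish the second bound, we integrate the same expression but only over the cube
$\{y : \sup_{\xi \in S} \|y_\xi\| \leq \frac{5}{2}\rho\}$. Note from the triangle inequality that this does not affect the
components of the integral for which $x \in B(S, 2\rho)$. Thus we have

\[ \sum_{x \in B(S,2\rho)} \rho^d \leq (5\rho)^d |B(S,\rho)|, \]

which gives the second bound. □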
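Both bounds of Lemma 8.1 can be observed numerically in the cyclic group $\mathbb{Z}/N\mathbb{Z}$. The sketch below assumes that a character $\xi$ acts by $\xi \cdot x = \xi x / N \bmod 1$ and that Bohr sets are defined with a strict inequality, as in the proof above; it uses exact rational arithmetic to avoid rounding at the boundary. The function name `bohr_set` and the chosen parameters are illustrative.

```python
from fractions import Fraction

def bohr_set(N, S, rho):
    """B(S, rho) in Z/NZ: all x with ||xi * x / N||_{R/Z} < rho for every xi in S."""
    rho = Fraction(rho)
    members = []
    for x in range(N):
        # Distance of xi*x/N to the nearest integer, computed exactly.
        if all(Fraction(min((xi * x) % N, N - (xi * x) % N), N) < rho for xi in S):
            members.append(x)
    return members

def check_size_bounds(N, S, rho):
    """Return (|B(S,rho)|, |B(S,2rho)|) after asserting the bounds of Lemma 8.1."""
    rho = Fraction(rho)
    d = len(S)
    small = len(bohr_set(N, S, rho))
    large = len(bohr_set(N, S, 2 * rho))
    assert small >= rho ** d * N          # lower bound |B(S,rho)| >= rho^d N
    assert large <= 5 ** d * small        # doubling bound |B(S,2rho)| <= 5^d |B(S,rho)|
    return small, large
```

For instance, `check_size_bounds(101, [1], Fraction(1, 4))` returns `(51, 101)`, comfortably inside both bounds.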
Next, we establish, following Bourgain, that regular Bohr sets (as defined in Definition
2.6) exist in abundance.

Lemma 8.2 (Regular Bohr sets are ubiquitious HDD- Let 0 < e < 1. Then there exists 
$\rho \in [\varepsilon, 2\varepsilon]$ such that $B(S,\rho)$ is regular.

Proof. We may assume $S$ is non-empty since the claim is trivial otherwise. Let
$f : [0,1] \to \mathbb{R}$ be the function $f(a) := \frac{1}{d} \log_2 |B(S, 2^a \varepsilon)|$. Observe that $f$ is non-decreasing
in $a$, and from Lemma 8.1 we have $f(1) - f(0) \leq \log_2 5$.

Suppose we could find $0.1 \leq a \leq 0.9$ such that $|f(a') - f(a)| \leq 20|a - a'|$ for all
$|a' - a| \leq 0.1$. Then it is easy to see that the Bohr set $B(S, 2^a \varepsilon)$ is regular. Thus, it suffices
to obtain an $a$ with this property. This can be done directly from the Hardy-Littlewood
maximal inequality (applied to the Lebesgue-Stieltjes measure $df$), or as follows. If no
such $a$ exists, then for every $a \in [0.1, 0.9]$ there exists an interval $I$ of length at most
0.1 and with one endpoint equal to $a$, such that $\int_I df > \int_I 20\, dx$. These intervals cover
$[0.1, 0.9]$, which has measure 0.8. By the Vitali covering lemma^{10}, one can thus find
a finite subcollection of disjoint intervals $I_1, \ldots, I_n$ of total length $|I_1| + \ldots + |I_n| \geq 0.8/5$
(say). But then we have

\[ \log_2 5 \geq \sum_{j=1}^{n} \int_{I_j} df > \sum_{j=1}^{n} \int_{I_j} 20\, dx \geq \frac{0.8}{5} \times 20, \]

a contradiction.


□ 


Next, we show that given any small set of points in $G$, one can find a large Bohr set
which avoids all of them except possibly for zero. We will need this to develop the
analogue of Lemma 6.2 (see Lemma 9.2 below).

Lemma 8.3 (Separation lemma). Let G be a finite additive group, and let A O G be a 
set of elements containing zero. Then there exists a set S G with |5| ^ 1 -|- log 2 |R| 
such that An B{S,j) = {0}. 

Proof. We induct on lA/ When |y4| = 1 the claim is trivial (set 5 = 0). Now suppose 
that |y4| ^ 2 and the claim has already been proven for smaller sets A. Suppose that 
2" < |y4| ^ 2"+^. Let ^ G G be chosen randomly. Observe that for each x G y4\{0}, the 
map I—^ ■ X is a non-trivial group homomorphism from G to M/Z, thus the random 

variable ^ • x is uniformly distributed over a cyclic subgroup of M/Z. In particular, we 
have 

P(a:e-BK,i)) = P(||Oa:||«/z<i)<i 
Summing this over ah non-zero A, we conclude 

E|(2l\{0})n5(e,i)|^(|R|-l)/2. 

In particular, we can hnd G G such that 

|.4nBK,i)K [RLtij <2”. 

^^One can also use the Besicovitch covering lemma at this point, which would in fact give slightly 
better bounds. Indeed, one can improve the constant 1/5 to 1/2, see for instance |15| . 
By induction hypothesis we conclude that there exists a set S' of cardinality at most n
such that 




and the claim follows by setting S' := S" U 


□ 


Our next task is to investigate the Fourier-analytic behavior of Bohr sets. In the first
instance we deal with a concept somewhat more general than that of a Fourier coefficient,
replacing a linear phase function by a locally linear phase function.

Lemma 8.4 (Generalized Fourier decay). Let $S \subseteq \widehat{G}$, $|S| = d$, be a set of characters.
Let $B := B(S,\rho)$ be a regular Bohr set, and let $\phi : B(S, 2\rho) \to \mathbb{R}/\mathbb{Z}$ be a function which
is locally linear in the sense that $\phi(x + y) = \phi(x) + \phi(y)$ whenever $x, y \in B(S,\rho)$.
Suppose that

\[ |\mathbb{E}_{x \in B}(e(\phi(x)))| \geq \eta \]

for some $0 < \eta \leq 1$ (large generalized Fourier coefficient). Then $\phi$ is close to constant
in the sense that for every $y \in B$ we have

\[ \|\phi(y)\|_{\mathbb{R}/\mathbb{Z}} \leq \frac{2^{12} d \|y\|_S}{\rho \eta^2}. \]

Proof. Let $y \in B$, and let $M \geq 0$ be the largest integer such that $M\|y\|_S \leq \rho\eta/400d$.
If $M = 0$ then we have $\|y\|_S \geq \rho\eta/400d$, and the claim is trivial. By Lemma 4.2 (ii) we
have

\[ |\mathbb{E}_{x \in B}(e(\phi(x))) - \mathbb{E}_{x \in B} \mathbb{E}_{-M \leq n \leq M} e(\phi(x + ny))| \leq \eta/2 \]

and hence by the triangle inequality

\[ |\mathbb{E}_{x \in B} \mathbb{E}_{-M \leq n \leq M} e(\phi(x + ny))| \geq \eta/2. \]

On the other hand, by the local linearity of $\phi$ we have $\phi(x + ny) = \phi(x) + n\phi(y)$ for
all $|n| \leq M$ and hence $e(\phi(x + ny)) = b(x) e(n\phi(y))$. By the triangle inequality we
conclude that

\[ |\mathbb{E}_{-M \leq n \leq M} e(n\phi(y))| \geq \eta/2. \]

But by the geometric series formula, the left-hand side is bounded by $2/(M\|\phi(y)\|_{\mathbb{R}/\mathbb{Z}})$.
This implies that

\[ \|\phi(y)\|_{\mathbb{R}/\mathbb{Z}} \leq \frac{4}{M\eta} \leq \frac{3200 d \|y\|_S}{\rho \eta^2}, \]

which implies the result. □
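The only analytic input in this proof is the geometric series bound $|\mathbb{E}_{-M \leq n \leq M} e(n\theta)| \leq 2/(M\|\theta\|_{\mathbb{R}/\mathbb{Z}})$. A quick numerical sketch of that bound follows; the grid of test angles is an arbitrary choice made for illustration.

```python
import cmath
import math

def dist_to_nearest_integer(theta):
    """The norm ||theta||_{R/Z}."""
    return abs(theta - round(theta))

def averaged_exponential(theta, M):
    """|E_{-M <= n <= M} e(n * theta)| with e(t) = exp(2*pi*i*t)."""
    total = sum(cmath.exp(2j * math.pi * n * theta) for n in range(-M, M + 1))
    return abs(total) / (2 * M + 1)

def bound_holds(theta, M):
    """Check the geometric series bound 2 / (M * ||theta||) for one pair (theta, M)."""
    norm = dist_to_nearest_integer(theta)
    if norm == 0:
        return True  # the bound is vacuous when theta is an integer
    return averaged_exponential(theta, M) <= 2 / (M * norm) + 1e-12
```

The bound is far from sharp (the Dirichlet kernel gives $1/((2M+1) \cdot 2\|\theta\|)$), but the loss only affects the absolute constant $2^{12}$ in the lemma.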



As a corollary we see that the normalized Fourier transform of a Bohr set decays away
from the “polar body” of that Bohr set^{11}. If $\xi \in \widehat{G}$ then define

\[ \|\xi\|_{B(S,\rho)} := \sup_{y \in B(S,\rho)} \|\xi \cdot y\|_{\mathbb{R}/\mathbb{Z}}. \]

Note that if $\xi \in S$ then $\|\xi\|_{B(S,\rho)} \leq \rho$.

^^One could obtain much better Fourier localization properties by replacing the Bohr sets by 
smoother weight functions; see for instance miEni for examples of this approach. This also con¬ 
veys the slight advantage that all weight functions can automatically be made regular. However these 
functions have the disadvantage of being spread out in physical space, and we found it more convenient 
to use Bourgain’s machinery of regular Bohr sets from nni instead. 
Corollary 8.5 (Fourier decay). Let $S \subseteq \widehat{G}$ be a set of $d$ characters, let $B := B(S,\rho)$
be a regular Bohr set, and let $0 < \theta \leq 1$. Then for any $\xi \in \widehat{G}$, we have

\[ |\mathbb{E}_{x \in B} e(\xi \cdot x)| \leq 64 \left( \frac{\theta d}{\|\xi\|_{B(S,\theta\rho)}} \right)^{1/2}. \]

where 

Proof. Apply Lemma 8.4 with $\phi(x) := \xi \cdot x$, with $\eta := |\mathbb{E}_{x \in B} e(\xi \cdot x)|$ and with $y$ being
an arbitrary element of $B(S, \theta\rho)$. □

We can exploit this decay via a Tomas-Stein almost-orthogonality type argument (also 
used by Bombieri [H] in the context of the large sieve; see also [IHl) to conclude 

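Corollary 8.6 (Local Bessel inequality). Let $S \subseteq \widehat{G}$ be a set of $d$ characters, let
$B := B(S,\rho)$ be a regular Bohr set, let $0 < \theta \leq 1$, and let $\xi_1, \ldots, \xi_k \in \widehat{G}$ be frequencies
such that $\|\xi_i - \xi_j\|_{B(S,\theta\rho)} \geq \delta$ for all $1 \leq i < j \leq k$ and some $\delta > 0$. Then

\[ \mathbb{E}_{x \in B} \Big| \sum_{j=1}^{k} b(j) e(\xi_j \cdot x) \Big|^2 \leq k + 2^6 k^2 (\theta d/\delta)^{1/2} \]

for any bounded complex numbers $b(j)$.

Proof. We have

\[ \mathbb{E}_{x \in B} \Big| \sum_{j=1}^{k} b(j) e(\xi_j \cdot x) \Big|^2 = \sum_{1 \leq i,j \leq k} b(i) \overline{b(j)}\, \mathbb{E}_{x \in B} e((\xi_i - \xi_j) \cdot x) \leq k + 2 \sum_{1 \leq i < j \leq k} |\mathbb{E}_{x \in B} e((\xi_i - \xi_j) \cdot x)| \leq k + 2^6 k^2 (\theta d/\delta)^{1/2} \]

thanks to Corollary 8.5. The claim follows. □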

We can dualize the above corollary to give the following result. This allows us to 
generalize, to the relative setting, a frequently-used consequence of Parseval’s identity: 
a large set $A \subseteq G$ cannot have too many large Fourier coefficients.

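Corollary 8.7 (Local Bessel inequality, dual version). Let $S \subseteq \widehat{G}$ be a set of $d$ characters,
let $B := B(S,\rho)$ be a regular Bohr set, let $0 < \theta, \eta \leq 1$, and suppose that $A \subseteq B$.
Let

\[ \Gamma := \{\xi \in \widehat{G} : |\widehat{1_A}(\xi)| \geq \eta\, \mathbb{E}(1_B)\}. \]

Then there exist frequencies $\xi_1, \ldots, \xi_k \in \widehat{G}$ with $k \leq 2/\eta^2$ such that any $\xi \in \Gamma$ is close
to some $\xi_j$ in the $\|\cdot\|_{B(S,\theta\rho)}$ norm:

\[ \Gamma \subseteq \{\xi \in \widehat{G} : \|\xi - \xi_j\|_{B(S,\theta\rho)} \leq 2^{16}\theta d/\eta^4 \text{ for some } 1 \leq j \leq k\}. \tag{8.1} \]

Proof. Let $\delta := 2^{16}\theta d/\eta^4$, and let $\xi_1, \ldots, \xi_k$ be frequencies in $\Gamma$ such that
$\|\xi_i - \xi_j\|_{B(S,\theta\rho)} \geq \delta$ for all $i \neq j$, and which is maximal with respect to set inclusion. Then it is clear that
(8.1) holds. For each $1 \leq j \leq k$, we have $\xi_j \in \Gamma$. Hence there exists a bounded complex
number $b(j)$ such that

\[ \Re\, \mathbb{E}_{x \in B} b(j) e(\xi_j \cdot x) 1_A(x) \geq \eta. \]

Summing this in $j$ and applying Cauchy-Schwarz, we conclude that

\[ \mathbb{E}_{x \in B} \Big| \sum_{j=1}^{k} b(j) e(\xi_j \cdot x) \Big|^2 \geq \eta^2 k^2. \]

Applying Corollary 8.6 we conclude that $\eta^2 k^2 \leq k + \frac{1}{4}\eta^2 k^2$, and hence $k \leq 2/\eta^2$. The
claim follows. □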


As a consequence, we can now generalize Bogolyubov’s argument (Lemma 6.3) to subsets
of Bohr sets.

Lemma 8.8 (Local Bogolyubov lemma). Let S ^ G be a set of d characters, and let 
B := B{S,p). Let AC B be a set with |A| = 6\B\. Then there exists a set S' C G with 
l^'l ^ 2^5-3 such that B{S U S', 2-^^5^p/d) C2A- 2A. 


Proof. It is convenient to replace 2A — 2A by the slightly smaller set A + A' — A — A'. 
Let £ = h/400(i. By Lemma there exists p' G [ep,2e,p] such that the Bohr set 
B' := B{S, p') is regular. By Lemma f4.21 (hi) we can find x E B such that 

Eyg^+s/lA(2/) ^ Ey^B^Aiv) - 200de ^ 6/2. 

Let A' := A n (a; + B'). From the Fourier inversion formulae 

1 ^( 3 ;) = ^ lA(Oe(^ • x); Ia^x) = ^ lA'(Oe(^ • x) 
we conclude that 

1 a * 1a' * 1 -A * l-A'(^) = |lA(0niA'(0re(^ • x). (8.2) 

C 6 G 

In particular, applying (EH with a; = 0 we conclude that 


e.£g|u * I Ax)? = Y. IUK)nuK)t. 

« 6 G 

The function 1 a * 1a' is supported on B{S,p + p'), which has cardinality at most 2\B 
since B is regular and e < l/200(i. Thus by Cauchy-Schwarz 

iEa;eG|lA * 1 a'(t)| ^ (E^jg^lA * 1 a'(t)) /2E(lij) 

= E(1a)'E(1aOV2E(1b) 

= ^[Ey^B^Aiu)) (Eyea:+B'lA(7/)) E(1 b)E(1b/)^ 


^ (5^A:-^E(1s)E(1sOV8- 


Thus we have 

Z IUK)tlUK)P > p'‘E(ifl)E(iflO'=. 

? 6 G 


(8.3) 


Now let 


fl:=KeG:|p.(0 |>p='-'"E(lB,)}, 
and let $x \in B(R, \frac{1}{10})$. Then by taking real parts of both sides of (8.2), we conclude


1a * 1a' * 1-A * 1-A'(3^) 


= 5 ^|lA(OniA'(OPcos( 2 <-a:) 

« 6 G 

^ E iiA(oriiA'(orcos(2vr/io) - E ii^(oriiA'(op 

= 5]|L«)riGK)tcos(27r/10) 

« 6 G 

-(cos(27r/10) + 1)5^ |TK)tlU-tt)P 

» 1(5^ iu«)nG«)p) - 2$^ iuK)nu40P 

CeG 

^ 3^^^E(1b)E(1sO' - E 

« 6 G 


using (8.3) and the definition of $R$. On the other hand, from Plancherel’s identity we
have

EiiA(or = E(iA) ^5E(is), 

? 6 G 


and hence 

1a * 1a' * 1-A * l-A'(^) ^ (^g ~ ■^)®"(1 -b)®(1-B')^ ^ 0) 

which implies that x is contained in Al + ^4' — A — ^4' and hence in 2^4 — 224. Hence we 
have 

B{R,^) C 221-2A 

We are not done yet, because we do not have good bounds for $|R|$. Let $\theta > 0$ be a
small parameter to be chosen later. Invoking Corollary 8.7, we conclude the existence
of frequencies $\xi_1, \ldots, \xi_k \in \widehat{G}$ with $k \leq 128\delta^{-3}$ such that


RC{^eG:\\^- ^j\\B{s,ep) ^ ^Od for some 1 ^ j ^ k}. 

Let S' := {.^ 1 ,... If X G B{S U S', Op) then in particniar x G B{S,0p). Thns if 
^ & R, then by the preceding incinsion we have 


11 ^ • X - Cj ■ x\\m/i. ^ ‘2‘^^S ^9d for some j. 
Also, since x G B{S', 9p), we get 


110 ■ <9p^9; 

by the triangle ineqnality we then obtain 

||e-x||M/z^229rtd. 

We thns conclnde that 

B{S U S', 9p) C B{R, 2^^S-^9d). 

Thns if we choose 9 := 2~^^6^/d, we have B{SUS', 9p) C B{R, A), and since B{R, A) 
2 A — 2A the claim follows. 


□ in 
9. The general group case

We now prove Theorem 2.7 which generalizes Theorem 2.3 to the case of arbitrary
finite additive groups $G$. We begin by disposing of part (ii) of the theorem, which is
rather easier to establish than (i).

Recall that in order to establish Theorem 2.7 (ii) we are to prove that if $S \subseteq \widehat{G}$ is a set
of $d$ characters, if $B = B(S,\rho)$ is a regular Bohr set, if $f : G \to \mathcal{D}$ is a bounded function
and if $\|f\|_{u^3(y+B)} \geq \eta$ then we have

wfWuHG) > iv^pvcd^r 

for some absolute constant $C$. By translation invariance we may take $y = 0$. Let
$\phi : B \to \mathbb{R}/\mathbb{Z}$ be a locally quadratic phase function on $B$, and suppose that

\[ |\mathbb{E}_{x \in B}(f(x) e(-\phi(x)))| = \eta. \]

It turns out to be convenient to have $\phi$ defined, and to be a quadratic form, on a
slightly larger Bohr set than $B$. This is not in general possible, but the same effect can
be achieved by first passing to a smaller Bohr set. In fact, in the argument which follows
we will have two smaller Bohr sets $B' = B(S,\rho')$ and $B'' = B(S,\rho'')$. Set $\varepsilon = c\eta/d$,
where $c$ is a small constant to be specified later. We will take $\rho' \in [\varepsilon\rho/2, \varepsilon\rho]$ so that $B'$
is regular (this is possible by Lemma 8.2) and $\rho'' = \varepsilon\rho'$ (we will not require $B''$ to be
regular). It will be convenient to write $\beta' := \mathbb{E} 1_{B'}$ and $\beta'' := \mathbb{E} 1_{B''}$. By Lemma 4.2 we
have

p = + 0{ed). 

Observe that the contribution from $z \in B \setminus B(S, (1 - 10\varepsilon)\rho)$ is at most $O(\varepsilon d)$, thanks to
the regularity of $B$. Thus we in fact have

\[ \mathbb{E}_{x \in B}(f(x) e(-\phi(x))) = \mathbb{E}_{z \in B} 1_{B(S,(1-10\varepsilon)\rho)}(z)\, \mathbb{E}_{x \in z+B'}(f(x) e(-\phi(x))) + O(\varepsilon d), \]

and hence by the pigeonhole principle there exists $z \in B(S, (1 - 10\varepsilon)\rho)$ such that

\[ |\mathbb{E}_{x \in z+B'}(f(x) e(-\phi(x)))| \geq 2\eta/3 \tag{9.1} \]

provided that $c$ is chosen sufficiently small.

We are going to compare $1_{z+B'}$, which is relevant to (9.1), with the function

\[ F(x) := \frac{1}{\beta''} \mathbb{E}_h 1_{z+B''}(x + h) 1_{z+B'}(x + 2h). \]

Write $B'_- = B(S, (1 - 2\varepsilon)\rho')$ and $B'_+ = B(S, (1 + 2\varepsilon)\rho')$. Note that if $x \in z + B'_-$ and
$x + h \in z + B''$ then $x + 2h = 2(x + h) - x$ is contained in $z + B'$. For such $x$, then, we
have $F(x) = 1$. Also, if $F(x) \neq 0$ then there is some $h$ such that $x + h \in z + B''$ and
$x + 2h \in z + B'$, which means that $x = 2(x + h) - (x + 2h)$ lies in $z + B'_+$. We have,
then,

\[ |F(x) - 1_{z+B'}(x)| \leq 2 \cdot 1_{B'_+ \setminus B'_-}(x). \tag{9.2} \]

Now note further that if $x + h \in z + B''$ and $x + 2h \in z + B'$ then

\[ x \in z + B' + 2B'' \subseteq B, \]

and also

\[ x + 3h = 2(x + 2h) - (x + h) \in z + 2B' + B'' \subseteq B. \]
Both of these are consequences of the fact that $z \in B(S, (1 - 10\varepsilon)\rho)$. In such an
eventuality, then, all four of the elements $x, x + h, x + 2h, x + 3h$ lie in $B$ and, since $\phi$
is quadratic, we have $(h \cdot \nabla)^3 \phi(x) = 0$, or in other words

\[ \phi(x) - 3\phi(x + h) + 3\phi(x + 2h) - \phi(x + 3h) = 0. \tag{9.3} \]
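Identity (9.3) says that the third finite difference of a quadratic phase vanishes along any arithmetic progression. The sketch below checks this for a globally defined quadratic polynomial on $(\mathbb{Z}/N\mathbb{Z})^2$, which is a simplifying assumption: the paper only requires $\phi$ to be locally quadratic on a Bohr set. The coefficients and the modulus are arbitrary illustrative choices, and $\phi$ is represented by its integer numerator modulo $N$.

```python
import itertools
import random

def make_quadratic(N, k, rng):
    """Random quadratic polynomial q(x) = sum Q_ij x_i x_j + sum b_i x_i + c on (Z/NZ)^k."""
    Q = [[rng.randrange(N) for _ in range(k)] for _ in range(k)]
    b = [rng.randrange(N) for _ in range(k)]
    c = rng.randrange(N)

    def q(x):
        quad = sum(Q[i][j] * x[i] * x[j] for i in range(k) for j in range(k))
        lin = sum(b[i] * x[i] for i in range(k))
        return (quad + lin + c) % N

    return q

def third_difference_vanishes(q, N, k):
    """Check q(x) - 3q(x+h) + 3q(x+2h) - q(x+3h) = 0 mod N for all x, h."""
    points = list(itertools.product(range(N), repeat=k))
    for x in points:
        for h in points:
            shifted = [tuple((xi + m * hi) % N for xi, hi in zip(x, h)) for m in range(4)]
            total = q(shifted[0]) - 3 * q(shifted[1]) + 3 * q(shifted[2]) - q(shifted[3])
            if total % N != 0:
                return False
    return True
```

A cubic term would break the identity, which is one way to see that the argument here is specific to the $U^3$ setting.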

Therefore

\[ F(x) e(-\phi(x)) = \frac{1}{\beta''} \mathbb{E}_h 1_{z+B''}(x + h) e(-3\phi(x + h)) 1_{z+B'}(x + 2h) e(3\phi(x + 2h)) e(-\phi(x + 3h)) =: \frac{1}{\beta''} \mathbb{E}_h g_1(x + h) g_2(x + 2h) g_3(x + 3h), \]

say, where the functions $g_1, g_2, g_3$ are all bounded by 1. It is immediate from (9.2) that

\[ |F(x) e(-\phi(x)) - 1_{z+B'}(x) e(-\phi(x))| \leq 2 \cdot 1_{B'_+ \setminus B'_-}(x). \]

From (iniii) we infer, then, that 

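\[ \frac{1}{\beta''} \mathbb{E}_{x,h} f(x) g_1(x + h) g_2(x + 2h) g_3(x + 3h) = \mathbb{E}_x f(x) F(x) e(-\phi(x)) = \mathbb{E}_x f(x) 1_{z+B'}(x) e(-\phi(x)) + O(\mathbb{E} 1_{B'_+ \setminus B'_-}) = \mathbb{E}_x f(x) 1_{z+B'}(x) e(-\phi(x)) + O(\varepsilon d \beta'), \]

the penultimate step being a consequence of the regularity of $B'$. If $c$ is chosen small
enough, this means in view of (9.1) that

\[ |\mathbb{E}_{x,h} f(x) g_1(x + h) g_2(x + 2h) g_3(x + 3h)| \geq \beta' \beta'' \eta/3. \tag{9.4} \]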

However Proposition 1.7 implies that we have

\[ |\mathbb{E}_{x,h} f(x) g_1(x + h) g_2(x + 2h) g_3(x + 3h)| \leq \|f\|_{U^3(G)}, \]

and hence from (9.4) we have

II/I|g3(g) ^ /9'/5"r^/3 ^ is^pV^rv/3 ^ 

for some absolute constant C. □ 

Now we turn to the proof of Theorem 12.71 fii. As in starting point is Proposition 

El which the reader may care to recall now. The argument is closely analogous to that 
in Ijni and hence in turn to that in 

Step 1: Linearization of phase derivative. We begin by carrying out the first major
step, which is to show that the function $h \mapsto \xi_h$, which roughly speaking captures the
derivative of the phase of $f$, matches up with a locally linear function.

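Proposition 9.1. Let $H' \subseteq G$, and suppose that $\xi : H' \to \widehat{G}$ is a function whose graph

\[ \Gamma' := \{(h, \xi_h) : h \in H'\} \subseteq G \times \widehat{G} \]

obeys the estimates

\[ K^{-1}N \leq |\Gamma'| \leq |9\Gamma' - 8\Gamma'| \leq KN \]

for some $K \geq 1$. Then there is a set $S \subseteq \widehat{G}$,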

di := 1^1 ^ 
a regular Bohr set Bi := B{S,p), where p G elements Xq E G,^ E G and a 

function M : B{S, j) ^ G satisfying the local linearity conditionf^ 

M{h ± h') = Mh ± Mh' whenever ||/i||s, ^ (9.5) 

and such that 

E(l/f/(xo + h)l^^^^^=2Mh+io\h E Bi) ^ 

Proof. As in the finite field case, the first step is to refine the graph $\Gamma'$ so that certain of
the iterated sum-difference sets $k\Gamma' - l\Gamma'$ are also graphs. To do this we use the following
generalization of Lemma 6.2.

Lemma 9.2. There exists a subset $\Gamma'' = \{(h, \xi_h) : h \in H''\}$ of $\Gamma'$ with

|F"| ^ 2-^K-^^N 

such that 4F" — 4F" is a graph. 

Proof. Let A C G be the set of all such that (0,.^) G 8F' — 8F'. Arguing as in the 
proof of Tvemma 16.21 we conclude that |A| ^ iF^. Applying T^emma 18.81 we can find a 

set S' G G with [S'! ^1-1-2 log 2 K such that A ft i?(S, |) = {0}. 

Let 4/ : G —>■ (M/Z)'^ be the homomorphism 4/(^) := (s(^))se 5 . Now let us cover the 
torus (M/Z)'^ by ^ 2®iF^^ cubes of side-length A. Since |F'| ^ N/K, the pigeonhole 
principle implies that there exists one of these cubes Q for which the set 

r":={(h,a)GF':vl>(e,)eg} 

has cardinality at least 2~^K~^^N. Now observe from the linearity of s that if (0,.^) G 
8 F" — 8F" then ||s(OlliR/z ^ H for all s E S. In other words, E B{S, |). But f, also 
lies in A, and hence .^ = 0 by construction. Since 8F" — 8F" is the difference set of 
4F" — 4F", we conclude that 4F" — 4F" is a graph as desired. □ 

Define H” so that F" = {(h, : h G H''}. Applying Lemma IFF!?! with A := H”, 

we obtain a set S' C G with [S'! ^ such that the Bohr set Bq := i?(S', |) is 

completely contained inside $2H'' - 2H''$. We will now work inside this Bohr set $B_0$ and
pass to progressively narrower Bohr sets $B_1, B_2, \ldots$ when necessary. We will eventually
end up at $B_5$; the dimension of $B_j$ will be denoted $d_j$, and so in particular $d_0 = |S|$. It
will turn out that $d_0 = d_1 = d_2 < d_3 = d_4 = d_5$, that is to say it is only in passing from
$B_2$ to $B_3$ that we shall increment the dimension of $B_j$. This is because that passage will
involve Lemma 8.8.

Since $2\Gamma'' - 2\Gamma''$ is a graph, we can find a (unique) function $M : B_0 \to \widehat{G}$ such that

\[ \{(h, 2M(h)) : h \in B_0\} \subseteq 2\Gamma'' - 2\Gamma''. \]

Since $2\Gamma'' - 2\Gamma''$ contains 0, we conclude that $M(0) = 0$. Also, since $8\Gamma'' - 8\Gamma''$ is a graph
we see that

\[ M(h_1) + M(h_2) = M(h_1') + M(h_2') \tag{9.6} \]

whenever $h_1, h_2, h_1', h_2' \in B_0$ are such that $h_1 + h_2 = h_1' + h_2'$; in other words, $M$ is a
Freiman homomorphism of order 2. In particular, since $M0 = 0$, we have the local
linearity relationship (9.5).

^{12}Recall that $\|h\|_S := \sup_{\xi \in S} \|\xi \cdot h\|_{\mathbb{R}/\mathbb{Z}}$.
By Lemma IHm there is p G [^, |] such that the Bohr set Bi := 5(5', pi) is regular. By 
Lemma mu there exists Xq G G such that 

^ 2 

Let us fix this $x_0$, and set $A := \{h \in B_1 : x_0 + h \in H''\}$, so that we have

|A| ^ 2 -^K-^^\Bi\. 

Observe that if $h, h' \in A$, then $(h - h', \xi_{x_0+h} - \xi_{x_0+h'})$ lies in $\Gamma'' - \Gamma''$, which is a subgraph
of $2\Gamma'' - 2\Gamma''$. Thus we have $\xi_{x_0+h} - \xi_{x_0+h'} = 2M(h - h')$. Combining this with (9.5),
we conclude that there exists $\xi_0 \in \widehat{G}$ such that

\[ \xi_{x_0+h} = \xi_0 + 2Mh \text{ for all } h \in A. \]

This concludes the proof of Proposition 9.1. □

Combining Proposition 9.1 with Proposition 5.4 leads immediately to the following,
which generalizes Proposition 6.4 to arbitrary $G$.

Proposition 9.3 (Large $U^3(G)$-norm implies locally linear phase derivative). Let $G$ be
an arbitrary finite additive group, and let $f : G \to \mathcal{D}$ be a bounded function such that
$\|f\|_{U^3(G)} \geq \eta$ for some $\eta > 0$. Then there exists a set $S \subseteq \widehat{G}$ with

\[ d_1 := |S| \leq 2^{C_3} \eta^{-C_3'}, \]

a regular Bohr set Bi := B{S,p) C 5(5, |) = Bq with p G [^, |], elements xq E G and 
fo E G, and a function M : Bq ^ G obeying the local linearity property (EUD, such that 

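\[ \mathbb{E}_{h \in B_1} \left| \mathbb{E}_{x \in G} T^{x_0+h} f(x) \overline{f(x)} e(-(\xi_0 + 2Mh) \cdot x) \right| \geq 2^{-C_4} \eta^{C_4'}. \tag{9.7} \]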

We could take Gi, G[ = 2 ^^, i = 3,4. 

Step 2: The symmetry argument. Let $S, B_1, x_0, \xi_0, M$ be as in Proposition 9.3. Using
the proof of Theorem 2.3 as a model, the next step would be to establish some symmetry
property on $M : B_0 \to \widehat{G}$, in the sense that the form $\{x,y\} := M(x) \cdot y - M(y) \cdot x$ is
small. More precisely, we shall establish

Lemma 9.4 (Symmetry of derivative). Let the notation be as in Proposition 9.3. For
any $x, y \in B_0$, let $\{x,y\}$ denote the anti-symmetric form

\[ \{x, y\} := M(x) \cdot y - M(y) \cdot x. \]

Then there exists a set S 3 of freguencies with S 3 ^ ^ ■ S and ds := | 5 ' 3 | ^ and 

a Bohr set B 3 = 5 ( 53 , 2“'"®p'"6) C 5i, such that 

\\{x,z}\\^/'E ^ 2 ^'^ri~^''^\\x\\s 3 for allx,z E B 3 . (9.8) 

It is permissible to take all of the Gi, G', i = 5, 6 , 7, equal to 2^^. 

Proof. Let 62 = 2 “‘" 3 - 2 G 4 -i 0 j.^g^+ 2 G^_ gy Lemma EIU we can hnd p 2 G [£ 2 , 262 ] such 
that B 2 := 5(5", P 2 ) C 5i is a regular Bohr set. Of course we have 

d 2 = di^ 2 ^^r]-^L 

■x&G^{x + h)h{x)h{h)e{—2Mh ■ x)\ ^ 2~^‘^r]^'^, 


We write dnn) as 


( 9 . 9 ) 
absorbing all the phase terms into the functions $b$. Applying Lemma 4.1 we can find
$x_1 \in G$ such that

\^h£Bi;x£B2^{^ + Xi + h)h{x + Xi)h{h)e{—2Mh ■ {x + Xi))| ^ (9.10) 

Absorbing the Xi terms into the functions b we conclude that 

|Eh 6 Bi;a;eS 2 b(h)b(x + h)h{x)e{-2Mh ■ x)\ ^ 2~^‘^ri^i. (9.11) 

Applying the Cauchy-Schwarz inequality fLemma l4.d|l to eliminate b(h), we then deduce 

'^h&Br,x,y&B2^{x + h)h{x)h{y + h)h{y)e{-2Mh ■ {y - x)) ^ (9.12) 

Making the substitution z := x + y + h, this becomes 

Ex,yeS 2 E^ex+y+S 2 b( 2 ;, x)h(z, y)e(-2M(z - x - y) ■ (y - x)) ^ (9.13) 


Absorbing as many phase terms into the functions b(^, x) and h{z,y) as we can, we 
conclude 

^x,y&B 2 ^z&+y+BMz,x)h{z,y)e{ 2 {x,y}) ^ 

Next, by the regularity of Bi and Lemma E21 (i), we observe that 

\R:,^y^B 2 ^z&x+y+B^^{z, x)b( 2 ;, i/)e(2{x, y}) 

-"^x,y&B2^z&B^i{z,x)h{z,y)e{2{x,y]) \ ^ 2^e2d 

which, due to the choice of 62 and the bound d ^ implies that 

Ex,yeH2E^eSib(^,x)b(2;,?/)e(2{x,|/}) ^ (9.14) 

In particular, by the pigeonhole principle in 2 ; we have 

|E^,y6B2b(x)b(?/)e(2{x,2/})| (9.15) 

for some bounded functions b(x),b(i/). At this point we observe the local bilinearity 
relationships 

{x + x\y} = {x,y] + {x ,y}] {x,y + y'} = {x,y} + {x,y'}, (9.16) 

which hold whenever all four of ||x|| 5 , ||x'|| 5 , Hi/Hs, ||l/1|s are at most |. We can then 
apply Cauchy-Schwarz iLemma l4.3j) to eliminate b(x) and conclude that 

|E^,yy 6 S 2 b(i/,i/')e( 2 {x,i/' -y})\^ 2-^^4-2^4C'^ 

and hence by the triangle inequality 

E,yeR2|E.6B2e(2{x,i/' -i/})| ^ 2-^^^-V^4. 

By the pigeonhole principle, there exists y' G B 2 such that 

E,eB2|E.6B2e(2{x,i/' - y})\ ^ (9.17) 

Fix this y'. Since ~ I/})I is bounded above by 1, we conclude that there 

exists a set A C B 2 with |A| ^ 

|Ea;6B2e(2{x,i/' -y})\^ 2“^'^4-3^4C' y E A. 

Applying Lemma l^Ol land recalling that ^2 ^ 2 ‘"^ri ~‘^3 and P 2 ^ £2 = 2 “‘^ 3 - 2 C' 4 -io^c^+ 2 C^^ 
we conclude that 

\\2{x,y' - y}\\R/z < 2^'^+^*^3+8C'^-2C'-iog'||^||^ x E B 2 ,y E A. 
Applying (9.16) (and recalling that p 2 ^ 2 e ^ A), we conclude that

||2{x,^}||r/z ^ X e B 2 ,z e2A- 2A. 

On the other hand, by applying Lemma 18.81 we can hnd S" C G with 15 U S"| ^ 
2i2C4+i6^-i2C^ g Bohr set := B{S U 5"^ 2 “®^“^‘" 3 - 26 C 4 ^ 2 C^+ 26 C^^ which is com¬ 
pletely contained in 2A — 2A and inside B 2 . Thus we have 

\\ 2 {x,z}\\^/'Z ^ 2 ^®+ 2 *^ 3 + 8 C'^- 2 CG 10 C'||^||^ gjj ^ 

Let us now eliminate the factor 2. Observe that $B_3 := \{2x : x \in B_3'\}$ is also a Bohr set
(with the frequency set $S \cup S'$ replaced by $S_3 := \frac{1}{2} \cdot (S \cup S')$). Since $\|x\|_S \leq 2\|x\|_{S_3}$,
also observe that $B_3 \subseteq B(S, 2\rho_2) \subseteq B_1$. By (9.16), which implies that $\{2x, z\} = 2\{x, z\}$,
we conclude (9.8) as desired. □

There are extremely close analogies between the above argument and that of ^ Equa¬ 
tions ()9.9|1 , (Ib.lOj) , (I9.11|l , (19.121) , ()9.18|1 , ()9.14D , (I9.15|l and (I9.17j) are analogous to ()b.7|l , 
(ESI), ESD, (1^^ - ESH), (1^^ - (1^^ and (lurm respectively. 

Step 3: Eliminating the quadratic phase component. We now return to the conclusion 
of Proposition 19.81 and localize the x and h variables to a small Bohr set. Let 

£4 = min ( 2 -^ 3 -C 4 -io^c'+ci^ 2 - 5 -^^ 4 -C 7 ^c'j^ 

let $S_3$ be the set of characters coming from the previous subsection, and let
$B_4 := B(S_3, \rho_4) \subseteq B_3$ be a regular Bohr set such that $\rho_4 \in [\varepsilon_4, 2\varepsilon_4]$. By the previous estimate
we have

||{a;, ^}||r/z ^ for all x, z G S 4 . (9.18) 

Let us write EH) as 

|E/xeBi;xeGb(h)b(x h)f{x)e{-2Mh • x)| ^ 

where we have absorbed some phase terms into the functions b as before. Since Bi = 
B{S,p) for some p^ we conclude from T,emma f4.2l tiii that 

lE/ieSi;xeGb(h)b(x -1- h)f{x)e{-2Mh ■ x) - 

IEh'eBi;heS 4 ;a:eGb(h h')b(x + h + h')f{x)e{-2M{h + h') ■ x) \ ^ 2 '^£ 4 di. 

This is at most and therefore 

|E/i'eSi;heS 4 ;xeGb(h h')b(x + h + h')f{x)e{-2M{h + h') ■ x) \ ^ 

Hence by the pigeonhole principle, there exists h' G B^ such that 

\^hGBr,xeGHh + h')h{x + h + h')f{x)e{-2M{h + h') ■ x)\ ^ 

Since $e(-2Mh' \cdot x) = e(-2Mh' \cdot (x + h))\, e(2Mh' \cdot h)$, we can absorb all the $h'$ terms into
the functions b to conclude that

|EheB 4 ;a:eGb(h)b(x - 1 - h)f(x)e(-2Mh ■ x)| ^ 

By Lemma (4. II we then have 

|Ej; 6 G;x,heS 4 b(h)b(x + y + h)f{x + y)e{-2Mh ■ {x + y))\ ^ 
and hence by the triangle inequality 

^yeG\^xMBMh,y)Hx + h,y)f{x + y)e{-2Mh ■ x)\ ^ 2 “^^“^? 7 ‘^y 
Now we observe from (inii) that

2Mh ■ X = M{x + h) ■ {x + h) — Mx ■ x — Mh ■ h — {x, h}, 
and hence by (imii 

|e(— 2Mh • x) — b(x + h)h{h)e{Mx ■ x)| ^ 27r ■ ^ 2~'"'^~‘^ri^K 

Thus we have

EyeGlEx,heH 4 b(h, y)h{x + h, y)f{x + y)e{Mx • x) | ^ . 

It is convenient to localize x farther. Let £5 = By Lemma 18.21 we 

can hnd a regnlar Bohr set = Bi^S^^p^) with G [£ 5 , 255 ]. By Lemma W7]\ fiii we 
have that 

Ey(zG\^x,heB 4 Hh,y)H^ + Ky)f{x + y)e{Mx ■ x)| 

-Ey^G\^w,h£Br,x£B 5 Hh, 2 /)b(x + w + h,y)f{x + w + y)e{M{x + w) ■ {x + w)) | 

is at most 2 ^^ 65 ( 13 / 84 , which on acconnt of the choice of £5 implies that 

Ey(zG\^w,h&B 4 -,x&B 5 Hh,y)Hx + w + h,y)f{x + w + y)e{M{x + w)-{x + w))\ ^ 

By the pigeonhole principle there exists $w \in B_4$ such that

Ey(zG\^heB 4 -,xeBMh,y)h{x + w+ h,y)f{x+ w+ y)e{M{x+ w) • (x + tc))| ^ 

Let us now apply Lemma 4.4. Since $B_4$ is regular and $\varepsilon_5$ is so small, we certainly have
$\mathbb{E}(1_{B_4})/\mathbb{E}(1_{B_4+B_5}) \geq 1/2$, and so that lemma allows us to conclude that

Ey^G\\f{x + W + y)e{M{x + w) ■ {x + M;))||„2(Bg) ^ 

This, of course, implies that

EyeG\\f{x + w + y)e{M{x + w) ■ {x + M;))||„3(Sg) ^ . 

The $u^3$ norm being invariant under translation, conjugation and quadratic phase
modulation, we conclude that


^yecWfWuHy+w+B,) > 2 V"- 

Making the change of variables $y \mapsto y + w$, and completing a small computation, we
obtain the claimed estimate, as desired. □

We remark that we have proved slightly more than was claimed, in that the quadratic phase
functions used to demonstrate the largeness of the $\|f\|_{u^3(y+B)}$ norm all agree up to
lower order (i.e. linear and constant) terms. However we were unable to find any way
to exploit this additional fact.
10. Bohr sets and generalized arithmetic progressions

Our focus from this point on is largely on the group $G = \mathbb{Z}/N\mathbb{Z}$, as we are working
towards connections with themes in ergodic theory, in particular involving $\mathbb{Z}$-actions.
As we have stressed, $\mathbb{Z}/N\mathbb{Z}$ is an appropriate group to consider if one is interested
in discrete questions concerning the integers. However some of what we have to say,
particularly in the present section, can be generalized without undue pain to arbitrary
additive groups.

We have obtained an inverse theorem, Theorem 2.7, for the $U^3(G)$ norm which relates that
norm to the quadratic bias norm $\|f\|_{u^3(y+B)}$ on Bohr sets $B$ (or on subspaces $W$, in
finite field cases such as $G = \mathbb{F}_5^n$). This is a fairly satisfactory state of affairs, except
for the presence of the Bohr set $B$; in particular, it is not clear at present what exactly
the locally quadratic phase functions are on $B$. In this section we show how the Bohr
set can, if desired, be replaced with a generalized arithmetic progression, and how to
characterize the locally quadratic phase functions on such progressions.

We begin by recalling what a generalized arithmetic progression is. 

Definition 10.1 (Generalized arithmetic progression). A generalized arithmetic progression
$P$ in an additive group $G$ is any set of the form

$P := \{a + l_1 v_1 + \dots + l_d v_d : 0 \leq l_j < L_j \text{ for all } 1 \leq j \leq d\}$

where $d \geq 0$, $a, v_1, \dots, v_d \in G$, and $L_1, \dots, L_d \geq 1$. We shall abbreviate the right-hand
side as $P = a + [0, L) \cdot v$, where $L := (L_1, \dots, L_d)$ and $v := (v_1, \dots, v_d)$. We call $a$ the
base point of the progression, $d$ the rank, $v_1, \dots, v_d$ the generators, and $L_1, \dots, L_d$ the
lengths of the progression. If all the sums in $P$ are distinct, so that $|P| = L_1 \cdots L_d$,
we say that $P$ is proper. A coset progression is any set of the form $P + H$ where $H$ is
a subgroup of $G$. We say that the coset progression $P + H$ is proper if $P$ is proper and
$|P + H| = |P||H|$ (i.e. all the sums in $P + H$ are distinct); we define the rank, base
point, etc. of the coset progression $P + H$ to be the same as that of its component $P$.
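Definition 10.1 can be made concrete with a small computation. The following is a minimal sketch, assuming the cyclic group $\mathbb{Z}/N\mathbb{Z}$ with elements represented as integers mod $N$; the function names `progression_sums` and `is_proper` are hypothetical and chosen only for this illustration.

```python
from itertools import product


def progression_sums(N, base, generators, lengths):
    """List every sum a + l_1 v_1 + ... + l_d v_d (mod N) with 0 <= l_j < L_j.

    Repeated values are kept, so the list always has L_1 * ... * L_d entries.
    Assumption: the ambient group is Z/NZ, written as integers mod N.
    """
    sums = []
    for coeffs in product(*(range(L) for L in lengths)):
        value = base + sum(l * v for l, v in zip(coeffs, generators))
        sums.append(value % N)
    return sums


def is_proper(N, base, generators, lengths):
    """A progression is proper when all of its sums are distinct."""
    sums = progression_sums(N, base, generators, lengths)
    return len(set(sums)) == len(sums)


if __name__ == "__main__":
    # Generators 1 and 10 with lengths 10 and 10 give the digits of 0..99.
    print(is_proper(101, 0, (1, 10), (10, 10)))
    # Generators 1 and 5 collide: 5 * 1 + 0 * 5 == 0 * 1 + 1 * 5.
    print(is_proper(101, 0, (1, 5), (10, 10)))
```

In $\mathbb{Z}/101\mathbb{Z}$ the first progression is proper of size 100, while the second is not, since the sum 5 arises from two different coefficient vectors. The rank-zero case (no generators) reduces to the single base point.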

The need to generalize from generalized arithmetic progressions to coset progressions
in the setting of a general group $G$ was first noted in [22].

We now use standard facts from the geometry of numbers to show that every Bohr 
set contains a large proper coset progression. The first lemma follows from a result
of Mahler (HH Chapter VIII, Corollary to Theorem VII]) together with Minkowski’s 
Second Theorem (loc. cit. Chapter VIII, Theorem V). This result (in fact, a rather 
stronger one) was used in an additive-combinatorial context in Bilu’s work on Freiman’s 
theorem 0 Lemma 2.1]. 

Lemma 10.2. Let $\Gamma$ be a lattice of full rank in $\mathbb{R}^d$. Then there exist linearly independent
vectors $w_1, \dots, w_d$ which generate $\Gamma$, and such that

$|w_1| \cdots |w_d| \leq 2 \cdot d! \cdot \mathrm{mes}(\mathbb{R}^d/\Gamma)$, (10.1)

where $\mathrm{mes}(\mathbb{R}^d/\Gamma)$ is the volume of a fundamental domain of $\Gamma$.

Of course, the classification of finite abelian groups tells us that every coset progression is also
a generalized arithmetic progression (by expanding H as the direct sum of cyclic groups, which can 
each be interpreted as an arithmetic progression) but in doing so one can cause the rank of the coset 
progression to increase enormously (by the number of generators needed to span H). 
Next, we give a “discrete John’s theorem” which shows that the intersection of a convex
symmetric body and a lattice of full rank is essentially equivalent to a progression. 

Lemma 10.3 (Discrete John’s theorem). Let $B$ be a convex symmetric body in $\mathbb{R}^d$, and
let $\Gamma$ be a lattice in $\mathbb{R}^d$ of full rank. Then there exists a $d$-tuple

$w = (w_1, \dots, w_d) \in \Gamma^d$

of linearly independent vectors in $\Gamma$ and a $d$-tuple $L = (L_1, \dots, L_d)$ of positive
integers such that

$(d^{-2d} \cdot B) \cap \Gamma \subseteq (-L, L) \cdot w \subseteq B \cap \Gamma \subseteq (-d^{2d}L, d^{2d}L) \cdot w$.

Here of course

$(-L, L) \cdot w := \{l_1 w_1 + \dots + l_d w_d : -L_j < l_j < L_j \text{ for all } 1 \leq j \leq d\}$.

Proof. We hrst observe using John’s theorem |121 (see also 13 Hi) and an invertible 
linear transformation that we may assume without loss of generality that $B_d \subseteq B \subseteq
d \cdot B_d$, where $B_d$ is the unit ball in $\mathbb{R}^d$. We may also assume $d \geq 2$, since the claim is
easy otherwise.

Now let $w = (w_1, \dots, w_d)$ be as in Lemma 10.2. For each $j$, let $L_j$ be the least integer
greater than $1/(d|w_j|)$. Then from the triangle inequality we see that $|l_1 w_1 + \dots + l_d w_d| < 1$
whenever $|l_j| < L_j$, and hence $(-L, L) \cdot w$ is contained in $B_d$ and hence in $B$.

Now let $x \in B \cap \Gamma$. Since $w$ generates $\Gamma$, we have $x = l_1 w_1 + \dots + l_d w_d$ for some integers
$l_1, \dots, l_d$; since $B \subseteq d \cdot B_d$, we have $|x| \leq d$. Applying Cramer’s rule to solve for $l_1, \dots, l_d$
and (10.1), we have


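$|l_j| = \frac{|x \wedge w_1 \wedge \dots \wedge w_{j-1} \wedge w_{j+1} \wedge \dots \wedge w_d|}{|w_1 \wedge \dots \wedge w_d|} \leq \frac{|x||w_1| \cdots |w_d|}{|w_j||w_1 \wedge \dots \wedge w_d|} \leq \frac{2 \cdot d! \cdot |x| \, \mathrm{mes}(\mathbb{R}^d/\Gamma)}{|w_j||w_1 \wedge \dots \wedge w_d|} = \frac{2 \cdot d! \cdot |x|}{|w_j|} \leq \frac{2d \cdot d!}{|w_j|}$

which is certainly at most $d^{2d} L_j$. It follows that $x \in (-d^{2d}L, d^{2d}L) \cdot w$, which is what we
wanted to prove. A more-or-less identical argument gives the inclusion $(d^{-2d} \cdot B) \cap \Gamma \subseteq
(-L, L) \cdot w$. □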


Let $x \mapsto \{x\}$ denote the fractional part map from $\mathbb{R}/\mathbb{Z}$ to the fundamental domain
$(-1/2, 1/2]$.

Lemma 10.4 (Bohr sets contain large coset progressions). Let $S \subseteq \widehat{G}$ be a set of $d$
characters, let $\rho < 1/4$ be a real number, and let $B(S, \rho) \subseteq G$ be a Bohr set. Then there
exists a proper coset progression $P + H$ of rank $d'$, $0 \leq d' \leq d$, where $P = (-L, L) \cdot v$
for some $L_1, \dots, L_{d'} \geq 1$ and $v_1, \dots, v_{d'} \in G$, and we have the inclusions

$B(S, d^{-2d}\rho) \subseteq P + H \subseteq B(S, \rho)$. (10.2)

In particular, from Lemma 8.1 we have

$|P + H| \geq \rho^d d^{-2d^2} N$. (10.3)

Furthermore, the vectors $(\{\xi \cdot v_j\})_{\xi \in S}$, $1 \leq j \leq d'$, can be chosen to be linearly
independent, and $H$ can be taken to be the orthogonal complement of $S$, that is to say
the group

$H := \{x \in G : \xi \cdot x = 0 \text{ for all } \xi \in S\}$. (10.4)









Remark. The lemma is at the same time a rehnement and a weakening of a lemma 
from 1^. The rehnement, corresponding to the fact that we nse Lemma ll(). 2 l rather 
than Minkowski’s second theorem, is that we obtain the left-hand inclusion in (inoD 
and not just the right-hand one. The weakening is that using just Minkowski’s second 
theorem (and thus sacrihcing the left-hand inclusion in (j 1 (). 2 j) j gives a stronger bound 

than (HIESD. 

Proof. Let $\phi : G \to (\mathbb{R}/\mathbb{Z})^d$ be the group homomorphism $\phi(x) := (\xi \cdot x)_{\xi \in S}$. Observe
that $\phi(G)$ is a finite subgroup of the torus $(\mathbb{R}/\mathbb{Z})^d$, and that $B(S, \rho)$ is the inverse image
of the cube $Q := \{(y_\xi)_{\xi \in S} : |y_\xi| \leq \rho\}$ under $\phi$.

Let $\Gamma \subseteq \mathbb{R}^d$ be the lattice $\phi(G) + \mathbb{Z}^d$. Though it is a slight abuse of notation, we consider
$\phi(G) \cap Q$ to be the same as $\Gamma \cap Q$. Applying Lemma 10.3 we can find a progression
$P := (-L, L) \cdot w$ for some linearly independent $w_1, \dots, w_{d'} \in \Gamma$ with $0 \leq d' \leq d$ such
that

$\Gamma \cap d^{-2d} \cdot Q \subseteq P \subseteq \Gamma \cap Q$.

Since the $w_j$ are independent, $P$ is necessarily proper. The claim now follows by setting
$v_j$ to be an arbitrary element of $\phi^{-1}(w_j)$ for each $1 \leq j \leq d'$, and setting $H$ equal to
the kernel of $\phi$, which is of course just (10.4). □
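The objects in the preceding proof can be computed directly in a small cyclic group. The following is a minimal sketch, assuming $G = \mathbb{Z}/N\mathbb{Z}$ with the pairing $\xi \cdot x = \xi x / N \bmod 1$ and assuming the Bohr set is defined with a non-strict inequality; the helper names `centered_frac`, `phi`, `bohr_set` and `kernel` are hypothetical.

```python
from fractions import Fraction
from math import floor


def centered_frac(t):
    """Representative of t mod 1 in the fundamental domain (-1/2, 1/2]."""
    r = t - floor(t)  # r lies in [0, 1)
    return r - 1 if r > Fraction(1, 2) else r


def phi(x, S, N):
    """The homomorphism x -> (xi . x)_{xi in S}, in centered coordinates."""
    return tuple(centered_frac(Fraction(xi * x, N)) for xi in S)


def bohr_set(S, rho, N):
    """Inverse image under phi of the cube {|y_xi| <= rho} (assumed closed)."""
    return [x for x in range(N) if all(abs(c) <= rho for c in phi(x, S, N))]


def kernel(S, N):
    """The subgroup H of (10.4): all x annihilated by every frequency in S."""
    return [x for x in range(N) if all(c == 0 for c in phi(x, S, N))]


if __name__ == "__main__":
    print(bohr_set((1, 2), Fraction(2, 11), 11))
    print(kernel((3,), 15))
```

Exact rational arithmetic is used so that membership on the boundary of the cube is decided without rounding. For prime $N$ and a non-zero frequency the kernel is trivial, which is the situation of the corollary that follows.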

When $G = \mathbb{Z}/N\mathbb{Z}$ is a cyclic group of prime order, the subgroup $H$ has no role to play
and we conclude the following corollary.

Corollary 10.5. Let $G = \mathbb{Z}/N\mathbb{Z}$ be a cyclic group of prime order, let $S \subseteq \widehat{G}$ be a set
of $d$ characters, and let $\rho < 1/4$ be a parameter. Then there is a proper generalized
arithmetic progression $P = (-L, L) \cdot v$ of rank at most $d$ and size at least $\rho^d d^{-2d^2} N$
such that $B(S, d^{-2d}\rho) \subseteq P \subseteq B(S, \rho)$. Furthermore, the vectors $(\{\xi \cdot v_j\})_{\xi \in S}$ are linearly
independent in $\mathbb{R}^d$.

In this paper, it is in general more convenient technically to work with Bohr sets than 
progressions or coset progressions. However, there is one task which is much easier 
to achieve on progressions than on Bohr sets, and that is to classify quadratic phase 
functions: 

Lemma 10.6 (Inverse theorem for locally quadratic functions). Let $G$ be a finite additive
group of odd order. Let $P + H$ be a coset progression in $G$, let $a$ be the base point
of $P$, and let $v_1, \dots, v_d$ be the generators. Let $\phi : P + H \to \mathbb{R}/\mathbb{Z}$ be a locally quadratic
phase function on $P + H$. Then there exists a self-adjoint homomorphism $M : H \to \widehat{H}$,
elements $\xi_0, \xi_1, \dots, \xi_d \in \widehat{H}$, elements $\eta_i, \lambda_{ij} \in \mathbb{R}/\mathbb{Z}$ for $1 \leq i, j \leq d$, and $c \in \mathbb{R}/\mathbb{Z}$ such

At the other extreme, in the finite field geometry setting $G = \mathbb{F}_5^n$ it is the progression component
$P$ which is irrelevant, because the properness of $P$ forces all the lengths to be less than 5. Since the
rank of $P$ is also under control, we thus see that $H$ is a substantial portion of $P + H$. Indeed, the fact
that Bohr sets in finite field geometries contain large subspaces was already exploited in the proof of
Theorem O 
that $\lambda_{ij} = \lambda_{ji}$ and

$\phi(a + l_1 v_1 + \dots + l_d v_d + h) = Mh \cdot h + 2 \sum_{i=1}^d l_i \xi_i \cdot h + \sum_{1 \leq i,j \leq d} l_i l_j \lambda_{ij} + \xi_0 \cdot h + \sum_{i=1}^d l_i \eta_i + c$ (10.5)

for all $l_1, \dots, l_d, h$ with $0 \leq l_j < L_j$ for all $1 \leq j \leq d$ and $h \in H$.


Remark. In the converse direction, it is easy to show that ()10.5|1 is indeed well-defined 
and gives a locally quadratic function if 3P -|- P is a proper coset progression, but we 
will not need that fact here. 


Proof. We may assume that Lj ^ 2 for all j, and we may translate so that a = 0. Let 
be the restriction of f to H. By Theorem 13.21 the quadratic extension theorem, 
we can extend fin to a globally quadratic phase function on G. Using Lemma EUl 
it is easy to see that f), when restricted to P -|- iL, has the form (1IIE3). Thus we may 
subtract off if from (f, which means that (f now vanishes on H. We now claim that 
under this reduction, (f takes the simpler form 

d d 

(f{liVi IdVd + h) = ■ h -|- hljXij + liTji. 

i=l l^ij^d i=l 

Observe that for any h E H, the function {h-'V)(f is locally linear on P-|-P and vanishes 
on H, and hence takes the form 

d 

{h ■ V)(f{livi + ... IdVd + h') = li{h ■ V)(f{vi) 

2=1 

for all li,..., Id, k with 0 ^ lj < Lj and h' G H. It is then easy to see that h h-^ 
{h-'V)(f{vi) is a group homomorphism from H to M/Z and thus there exists fiEH such 
that {h ■ V)(f{vi) = 2f,i ■ h for all h E H. Using this, we thus reduce to showing that 


(fihvi 


+ idVd) = 


liljXij 


or equivalently that 


d 

E 

2=1 


kVi- 


d 

0(/l'Ui “t" . . . Id'^d) ^ ^ A^ji' ~h ^ ^ “1“ li 

l^i<j^d 2=1 

We induct on d. When d = 0 there is nothing to prove. Now suppose that the claim is 
already proven for d — 1. We observe that the derivative {vd • V)0 is linear, and hence 


{vd ■ V)0(/iTi ... -h IdVd) = ‘2liXid + Vd 

l^i^d 

for some Ai^,..., Xdd, Vd ^ M/Z, with the caveat that Id now must be less than Ld — I 
rather than Ld. (Note that it is always possible to divide by two in R/Z, though 
the value obtained need not be unique). The claim then follows from the induction 
hypothesis and a simple “integration” argument which we omit. □ 
Now, let us specialize to the setting of cyclic groups $\mathbb{Z}/N\mathbb{Z}$ of prime order. We begin
by defining some special functions on this set.

Definition 10.7 (Bracket polynomials). Let $\mathbb{Z}/N\mathbb{Z}$ be a cyclic group of prime order.
If $k \geq 0$, we define a bracket monomial of degree $k$ on $\mathbb{Z}/N\mathbb{Z}$ to be any function
$\phi : \mathbb{Z}/N\mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ of the form

$\phi(x) = a\{\xi_1 \cdot x\} \cdots \{\xi_k \cdot x\} \bmod 1$

where $\xi_1, \dots, \xi_k \in \mathbb{Z}/N\mathbb{Z}$ and $a \in \mathbb{R}$; we refer to $\xi_1, \dots, \xi_k$ as the frequencies of the
monomial. If $L \geq 0$ and $S \subseteq \mathbb{Z}/N\mathbb{Z}$, we define a bracket polynomial of degree at most
$k$, length at most $L$ and frequency set $S$ to be any function $\phi : \mathbb{Z}/N\mathbb{Z} \to \mathbb{R}/\mathbb{Z}$ which
can be expressed as the sum of $L$ or fewer bracket monomials of degree at most $k$ and
frequencies inside $S$. We write $\mathrm{Freq}(\phi) \subseteq S$. Note that if $|S| = d$ then we may always
assume that $L \leq k d^k$; for this reason there will be little subsequent discussion of length.
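Definition 10.7 can be illustrated by evaluating bracket monomials exactly. The following is a minimal sketch, assuming the pairing $\xi \cdot x = \xi x / N \bmod 1$ on $\mathbb{Z}/N\mathbb{Z}$, fractional parts taken in $(-1/2, 1/2]$, and a rational coefficient $a$ (the definition allows any real $a$); the names `bracket_monomial` and `bracket_polynomial` are hypothetical.

```python
from fractions import Fraction
from math import floor


def centered_frac(t):
    """Representative of t mod 1 in the fundamental domain (-1/2, 1/2]."""
    r = t - floor(t)  # r lies in [0, 1)
    return r - 1 if r > Fraction(1, 2) else r


def bracket_monomial(a, freqs, N):
    """Return x -> a * {xi_1 . x} * ... * {xi_k . x} mod 1 on Z/NZ.

    Assumption: a is rational so that the arithmetic stays exact.
    The degree is len(freqs); degree 0 gives the constant a mod 1.
    """
    def phi(x):
        value = Fraction(a)
        for xi in freqs:
            value *= centered_frac(Fraction(xi * x, N))
        return value % 1  # representative in [0, 1) of the class in R/Z
    return phi


def bracket_polynomial(monomials):
    """Sum of bracket monomials, again reduced mod 1."""
    def phi(x):
        return sum(m(x) for m in monomials) % 1
    return phi


if __name__ == "__main__":
    linear = bracket_monomial(1, (1,), 7)
    quadratic = bracket_monomial(1, (1, 1), 7)
    print(linear(4), quadratic(4), bracket_polynomial([linear, quadratic])(4))
```

With $N = 7$ and $x = 4$ the centered fractional part of $4/7$ is $-3/7$, so the degree-one monomial takes the value $4/7$ in $\mathbb{R}/\mathbb{Z}$ and the degree-two monomial takes the value $9/49$. The length of a bracket polynomial in this sketch is simply the number of monomials passed in.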

These bracket polynomials are special cases of the generalized polynomials considered 
in various papers of Haland, Haland-Knuth, Bergelson and Leibman. laisiEniEiiEHi, 
though with the (minor) caveat that our fractional parts take values from $-\frac{1}{2}$ to $\frac{1}{2}$
while the ones in those papers take values from 0 to 1.

Define a bracket quadratic to be a bracket polynomial of degree at most 2. We can now
link locally quadratic phase functions with bracket quadratics. 

Proposition 10.8. Let G = IjNI be a cyclic group of prime order, let S ^ G be a set 
of d characters, and suppose that p G (0, |]. Let P be the proper progression contained 
in B{S,p) which was constructed in Corollary lid. ,51 and let f : B{S,p) —>• M/Z be 
a locally quadratic phase function on B{S,p). Then there exists a bracket quadratic 
0 : l/NI —>■ M/Z with Freq(0) C S such that f f on P. 

Proof Let Vi,... ,Vd and Li,... ,Ld be the generators and lengths of P. By Lemma 
fTTn)l we have 

d 

fihvi + . . . + IdVd) = Urji + c mod 1 (10.6) 

2=1 

for some real numbers Ajj, pi. 

Next, let $ : P —>• M'^ be the map 

$(x) :={({e-x})^,5 ). 

Since P C B{S,p), we see that ‘h(P) lies inside the cube Q := {{y^)^£s ■ Idcl ^ 
p for all ^ G S'}. Since p < i, it is also easy to verify that 

<l>(/ini + ... + IdVd) = h^ivi) + ... + ld^{vd). 

From Corollary IK). 51 we know that the *F(nj) are linearly independent. Thus there exists 
a vector ut G such that 

^{hVi + . . . + IdVd) ■ Ui = k. 

Writing Ui = aiid x = liVi + ... + IdVd we conclude that 

h = 

«65 
Inserting this formula into (10.6) we obtain the claim. □

We can now give a version of Theorem 2.7, the $U^3(G)$ inverse theorem, in the case
$G = \mathbb{Z}/N\mathbb{Z}$, which involves bracket quadratic functions. In this theorem $C_0, C_1, \dots$ and
$c$ denote absolute constants which do not vary from line to line.

Theorem 10.9 (Inverse theorem for U^{L/NL), bracket quadratic functions). Let r] G 
[0,c), and let LjNL he a cyclic group of prime order. If f : LfNL V is a hounded 

function such that \\f\\u^(z/NZ) ^ then there exists a set S ^ LfNL of size d ^ 
and a proper progression P of rank at most d and size |P| ^ exp(— with the 
inclusions 

B(S, exp(-,r^')) CPC B{S, i), (10.7) 

and there exists a generalized quadratic cf : LfNL —>• M/Z with Freq(0) C S such that 
we have the local quadratic bias estimate 

( 10 . 8 ) 

for some h G LfNL. More generally, for any non-empty set AGP there exists an 
La G LjNL such that 

\E,^A{T^^f{x)e{-<Pix)))\ ^ (10.9) 

Furthermore, there exists another generalized quadratic (j) : LfNL M/Z OTf/iFreq(0) C 
S such that we have the global quadratic bias estimate 

|Exez/ 7 vz(/(a:)e(- 0 (a;)))| ^ exp{-p~^^). ( 10 . 10 ) 

Conversely, suppose that S is a set of d frequencies and that cf : LjNL —>• M/Z is a 
bracket quadratic with Freq(0) C S. Suppose that f : LfNL ^ V is a function such 
that \E{f {x)e{—(f){x))\ ^ p. Then we have 

\\f\\u^ > {p^pVG.d^y. ( 10 . 11 ) 

Proof. Applying Theorem 12 . 7L we can find a regular Bohr set B := B{S,p) in LjNL 
with d = [S'! ^ p~^° and p ^ p^°, a 2 / G LjNL, and a locally quadratic phase function 
(fo : B —y LjNL on B such that 

KMTyf{x)e{-Mm\>v^^- 

We could take Gq = 2^®. Next, let £ = r^^Co+ii apply Corollary 1 10. 51 to find a proper 

progression P of rank at most d such that 

B{S,eexp{—p~'^°~^)p) CPC B{S,ep). 

By Lemma Hill (iii), we can hnd w G B{S, (1 — e)p) such that 

|E^ 6 ^+p(T^/(a;)e(- 0 o(x)))| ^ 

and (HIEHI) follows after translating by w and applying Proposition 11 0.8l A very similar 
argument gives (mEi. Note that the inclusions on P follow by choice of p, and the 
lower bound on |P| follows from Lemma o 

Now we prove P0.1()|) . From p().7j) and Lemma o we can hnd a regular Bohr set 
B' := B{S,p') with p' ^ exp(— 77 “*^^“^) which is contained in P. Applying (j 1 ().9j) we 
can hnd a shift h' G LjNL such that 

\E{T^'f{x)e{-ct>{x))lB'{x))\ ^ 
Let s' = /y 2 Ci+ 9 ^ let X • [0,1] be a smooth cutoff such that x('S) = 1 when

|s| ^ p'{l — s'), x('5) = 0 when |s| ^ p'{l + e') and such that the derivative estimate 
llx^lloo ^ 100/e'^p'^ holds true. By the regularity of B' we see that 

f{x)e{-4>{x))\[x{{i ■ x})| ^ - 200£'d)ElB' ^ 

? 6 S 

and this is at most exp(—by Tjemma l 8 . 1 l Next, we use Fourier expansion on 
M to write 

X(s) = / Xit)e{ts) dt where x(t) := / x(s)e(-ts) ds. 

This allows us to conclude that 

I /••• / f{x)e{-(j){x) + ■ x})]Y[xm)dt^\ ^ ( 10 . 12 ) 

Jr Jr 

Now set A := e'~^^^p'~^. Then by integration by parts, applied twice, we have 


\x{t)\dt = 


dt I 




x'{s)e{-ts)ds\ ^ — 


100 r°° dt 100 




e'p' Jx £ 1/2 


and whence 


This implies that 


|x(t)| dt ^ 2A||x||oo + ^ ^ 8Ap' + ^ ^ 2®£ 


/ • • • / n ^ exp(-p 

t/ M t/ M 


Comparing this with (I10.12j] . we conclude the existence of real numbers t^ for ^ E S 
such that 

\E^T’^' f{x)e{-(j){x) + t^{^ • x}) I ^ exp(-p“3^i“^) 

? 6 S 

and (jl 0 . 10 |l follows. 


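Finally, we prove (10.11). Assume then that $\phi$ is a bracket quadratic with $\mathrm{Freq}(\phi) \subseteq S$,
$|S| = d$, and that $f : \mathbb{Z}/N\mathbb{Z} \to \mathcal{D}$ is a function with $|\mathbb{E}(f(x)e(-\phi(x)))| \geq \eta$. Set
$\varepsilon := \eta/160d$, and select a $\rho \in [\varepsilon, 2\varepsilon]$ such that $B := B(S, \rho)$ is regular. Introducing an
averaging over translates of $B$, we see that

$|\mathbb{E}_y \mathbb{E}_{x \in y+B}(f(x)e(-\phi(x)))| \geq \eta$. (10.13)

Now write $U := B(S, \frac{1}{2} - 10\rho)$. We wish to exclude from (10.13) those $y$ which lie in the
complement $U^c$ of $U$, since the bracket functions $\{\xi \cdot x\}$, $\xi \in S$, fail to be linear when
$\{\xi \cdot x\} \approx \pm\frac{1}{2}$. To this end, we estimate

$|\mathbb{E}_y \mathbb{E}_{x \in y+B}(f(x)e(-\phi(x))) 1_{U^c}(y)| \leq \mathbb{E}_y 1_{U^c}(y) \leq \sum_{\xi \in S} \mathbb{E}_y 1_{\|\xi \cdot y\|_{\mathbb{R}/\mathbb{Z}} > \frac{1}{2} - 10\rho} \leq 40d\rho \leq 80d\varepsilon$,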
^ 4:0dp ^ 80de, 
the penultimate estimate following from the fact that $N$ is prime, so that $\xi \cdot x$ takes on
the values $r/N$, $r \in \mathbb{Z}/N\mathbb{Z}$, precisely once each as $x$ varies. Combining this with (10.13)
we see that

$|\mathbb{E}_y \mathbb{E}_{x \in y+B}(f(x)e(-\phi(x))) 1_U(y)| \geq \eta/2$,

and hence there is $y$ such that

$|\mathbb{E}_{x \in y+B}(f(x)e(-\phi(x)))| \geq \eta/2$. (10.14)

We claim that $\phi$ is a quadratic phase function on $y + B$. To check this, we must
show that if the cube $(x + \omega_1 h_1 + \omega_2 h_2 + \omega_3 h_3)_{\omega \in \{0,1\}^3}$ is contained in $y + B$ then
$(h_1 \cdot \nabla_x)(h_2 \cdot \nabla_x)(h_3 \cdot \nabla_x)\phi(x) = 0$. This is easy to prove once one appreciates that (for
example) if $x, x + h_1 \in y + B$ and $\xi \in S$ then $\{\xi \cdot (x + h_1)\} = \{\xi \cdot x\} + \{\xi \cdot h_1\}$. Indeed, this
identity is patently true (mod 1), and furthermore one has the bounds $|\{\xi \cdot x\}| \leq \frac{1}{2} - 9\rho$
and $|\{\xi \cdot h_1\}| \leq 2\rho$, whence $|\{\xi \cdot (x + h_1)\}| \leq \frac{1}{2} - 7\rho$.

Equation (10.14), then, implies that

$\|f\|_{u^3(y+B)} \geq \eta/2$.

The result is now an immediate consequence of Theorem 2.7 (ii). □
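The additivity of fractional parts used in the proof above can be checked exhaustively in a small example. The following is a minimal sketch, assuming the pairing $\xi \cdot x = \xi x / N \bmod 1$ and fractional parts in $(-1/2, 1/2]$; the function names and the sample parameters are hypothetical.

```python
from fractions import Fraction
from math import floor


def centered_frac(t):
    """Representative of t mod 1 in the fundamental domain (-1/2, 1/2]."""
    r = t - floor(t)  # r lies in [0, 1)
    return r - 1 if r > Fraction(1, 2) else r


def additive_at(xi, x, h, N):
    """Test whether {xi.(x+h)} equals {xi.x} + {xi.h} as real numbers."""
    lhs = centered_frac(Fraction(xi * (x + h), N))
    rhs = centered_frac(Fraction(xi * x, N)) + centered_frac(Fraction(xi * h, N))
    return lhs == rhs


def check_additivity(xi, rho, N):
    """Check additivity for all x, h with |{xi.x}| <= 1/2 - 9 rho, |{xi.h}| <= 2 rho."""
    half = Fraction(1, 2)
    for x in range(N):
        if abs(centered_frac(Fraction(xi * x, N))) > half - 9 * rho:
            continue
        for h in range(N):
            if abs(centered_frac(Fraction(xi * h, N))) > 2 * rho:
                continue
            if not additive_at(xi, x, h, N):
                return False
    return True


if __name__ == "__main__":
    print(check_additivity(7, Fraction(1, 40), 101))
    # Without the size constraints the identity can fail by a wrap-around:
    print(additive_at(1, 50, 1, 101))
```

Under the stated bounds the sum of the two fractional parts has absolute value at most $\frac{1}{2} - 7\rho$, so no wrap-around can occur. The last line shows a failure near $\pm\frac{1}{2}$, which is the reason the set $U$ is introduced.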

11. Application: a bound for $r_4(G)$

As an application of Theorem 2.7 we obtain a bound for $r_4(G)$, the size of the largest
set $A \subseteq G$ with no 4-term arithmetic progressions.

Theorem 11.1 (Szemerédi’s theorem for $G$). Let $G$ be a finite additive group of order
$N$, where $(N, 6) = 1$. Then we have the bound

$r_4(G) \ll N(\log\log N)^{-c}$

for some absolute constant $c > 0$.

The reader may hnd it helpful to recall Gowers’ argument j2S] in the case G = LfNL 
as explained, for example, in Our argument here will be similar, though we must 
handle torsion in G. Gowers did not use the so-called “symmetry argument” of ^ since 
he was able to apply the weak inverse theorem. Theorem 11.101 where we shall apply 
Theorem IT7I It is likely that if our only interest was in proving Theorem 111.11 then 
we could do likewise. However, as remarked in the introduction, our work is ultimately 
directed towards a study of 4-tuples $p_1 < p_2 < p_3 < p_4 \leq N$ of primes in arithmetic
progression, and potentially towards the bound $r_4(\mathbb{Z}/N\mathbb{Z}) \ll N(\log N)^{-c}$. For these
applications one does need the full strength of Theorem 2.7 (when $G = \mathbb{Z}/N\mathbb{Z}$).

The key to the proof of Theorem 11.1 is the following density increment result, which
is in the spirit of Proposition 7.2 but rather more complicated.

Proposition 11.2. Let G be an abelian group of size N, and suppose that all elements 
of G have order at most Suppose that ( 6 , A) = 1, that N is sufficiently large, 

that d ^ 1 /log log A is smaller than some absolute constant, and that A C G has size 
at least 6N. Suppose that A contains no 4-term arithmetic progression. Then there is 
some subgroup G' ^ G, |G'| ^ A^/^, together with a coset t + G' such that 

Exet+cAA^x) ^ ExeG^A^x) + , 

where G is some absolute constant. 
Proof. Write a := El^ and / := 1a — o.. Applying Corollary 11.81 we have ||/||i73(G) ^
5^/8; applying Theorem 12.71 we can then find a regnlar Bohr set B{S,p) in G with 
\S\ ^ S~‘^ and p ^ snch that 

Applying Lemma ^31 we see that B contains a proper coset progression P + iL of rank 
at most 5“*^ snch that 

|P + P| ^ exp{-5-^)N. 

Since every element in G has order at most we see that the lengths of the proper 

progression P are also at most Thns |P| ^ which implies that 

( 11 . 1 ) 

By the definition of H, we see that B can be partitioned into cosets of H, and 

whence 

^y&G\\f\\u3{y+H) ^ 

Set a'{y) := Kx^y^HlA^x). From the triangle ineqnality we have 

||/IU3(j/+r/) ^ ||1 a g. (j/)|| u3(y+i/) + |cr (j/) o.\ 

= ||1a - a{y)\\u3{y+H) - 8 \a{y) - a] + ^\a\y) - a|, 

and so either 

E^ecilU - a\y)U^^y+H) - 8|«'(|/) - «| ^ 5^/2 (11.2) 

or 

^y&G\oi{.y) -a\> (5'^/18. 

Snppose the latter ineqnality holds. From T;emma f4.1l we have 

Ey6G(a (l/) - a) = 0; 

adding this to the preceding estimate and applying the pigeonhole principle, we conclnde 
that there exists y snch that 

¥.^^y+H^A{.x) = a\y) ^ a + ^'^/36 = E 3 ,gGlA(a:) + 5'^/36, (11.3) 

which implies the proposition (with a change to the absolnte constant G). Snppose, 
then, that (irr^ holds. By the pigeonhole principle, we can find y E G snch that 

||1a(x) - a{y)\\u3(y+H) > 8|a'(|/) - a\ + 6 ^/2. 

By translating A we may take y = 0. Writing a' := Kx£h1a{x) we conclnde the existence 
of a qnadratic phase fnnction (f : H R/Z snch that 

\^xeH{fH{x)e{-(j){x)))\ ^ 8|a' - a| + <5*^/2, 

where /iy(x) := 1^(3;) — a'. Applying Lemma ItOI for Lemma HO.bj) . we may thns find a 
self-adjoint homomorphism M ■. H ^ H and ^ E H snch that 

\^xeHfH{x)e{—Mx ■ x)e{—^ ' x)\ ^ 8|a' — a| -f /2. (11-4) 

As in the proof of Theorem l7.ll the next step is to locate a large snbgronp of H on which 
M vanishes. To achieve this we need some preliminary algebraic (and Fonrier-analytic) 
lemmas, of similar fiavonr to Lemma o 
Lemma 11.3 (Orthogonal complements). Let $K$ be any subgroup of $H$, and let $K^\perp \subseteq H$
be the subgroup

$K^\perp := \{y \in H \mid Mx \cdot y = 0 \text{ for all } x \in K\}$.

Then $|K^\perp| \geq |H|/|K|$.

Proof. Let $\phi : H \to \widehat{K}$ be the homomorphism $\phi(y)(x) := Mx \cdot y$. Then $K^\perp$ is precisely
the kernel of $\phi$. But since $\phi$ is a homomorphism with a domain of size $|H|$ and a range
of size at most $|\widehat{K}| = |K|$, the claim follows. □
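Lemma 11.3 can be verified by brute force in a small example. The following is a minimal sketch, assuming $H = (\mathbb{Z}/N\mathbb{Z})^n$ and modelling the self-adjoint homomorphism $M$ by a symmetric integer matrix `A`, so that $Mx \cdot y$ corresponds to $x^T A y / N \bmod 1$; this matrix model and all function names are assumptions made for illustration.

```python
from itertools import product


def form(A, x, y, N):
    """The bilinear form Mx . y, returned as an integer mod N (value / N in R/Z)."""
    n = len(x)
    return sum(A[i][j] * x[i] * y[j] for i in range(n) for j in range(n)) % N


def span(gens, N, dim):
    """Subgroup of (Z/NZ)^dim generated by gens, built by closing under addition."""
    zero = tuple([0] * dim)
    group = {zero}
    frontier = [zero]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = tuple((a + b) % N for a, b in zip(x, g))
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group


def orthogonal_complement(A, K, N, dim):
    """All y in (Z/NZ)^dim with Mx . y = 0 for every x in K."""
    return {y for y in product(range(N), repeat=dim)
            if all(form(A, x, y, N) == 0 for x in K)}


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    K = span([(1, 2, 0)], 5, 3)
    K_perp = orthogonal_complement(identity, K, 5, 3)
    print(len(K), len(K_perp), len(K_perp) * len(K) >= 5 ** 3)
```

Here $K$ is generated by $(1, 2, 0)$ in $\mathbb{F}_5^3$, and $1^2 + 2^2 = 5 \equiv 0$, so $K$ is contained in its own orthogonal complement. The complement has 25 elements, matching the lower bound $|H|/|K| = 125/5$.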

Lemma 11.4 (Gauss sum lemma). Let K he any finite group with {Q,K) = 1, and 
suppose that every non-zero element of K has order at most t for some t > 2. Let 
M : K K be a self-adjoint homomorphism. Then, if \K\ ^ there exists a 

non-zero element x G i^\{0} such that Mx ■ x = 0. 


Proof. We can assume that $M$ is injective (and hence bijective), since otherwise we can
just set $x$ to equal a non-zero element in the kernel of $M$. Using the classification of
finite abelian groups, we can write $K$ as the direct sum of cyclic groups of odd prime
power order. Note that if $p$ is a prime such that at least three cyclic groups of order
equal to a power of $p$ appear in this direct sum, then $K$ contains a subgroup isomorphic
to $\mathbb{F}_p^3$. Restricting $M$ to $\mathbb{F}_p^3$ (note that $M$ will still be self-adjoint) and applying Lemma
o we can then conclude the existence of a non-zero x E K such that Mx ■ x = h. Thus 
we may assume that for each p ^ 5 there are at most two cyclic groups of order equal 
to a power of p in the direct sum decomposition of K, in which case we may write 

K = X (Z/pJ^Z) 

for some distinct primes pi,... ,pk and exponents Uj, u'j. Let n 7^ 0 be any integer, and 
dehne 

n~^ := {x ^ K : nx = 0}. 


Writing 


n = (-l)>i 


Vl 


Vu 

■Pk ^ 


one conhrms the estimate 


k 

= ]^min(p“bpj'’) iHin(pJbp7) ^ (H-^) 


Let X : —>• R be a smooth bump function such that x('S) = 1 when ||s||ffi/z < l/2f, 

x(-s) = 0 when ||s||]r/z ^ 1/t, and for which the derivative estimate ||x'‘"||oo ^ 100/f^ 
holds true. Observe that if Mx ■ x is non-zero, then \\Mx ■ tHr/z ^ 1/t since x (and 
hence Mx ■ x) has order at most t. Thus 


^x(iK^Mx.x=0 ' ^)' 


Expanding in a Fourier series gives 

E^eAlMx.x=o = 'Yl x{n)'^xe{nMx • x) 


where 


X{n) 


/ X{s)e{-ns) ds. 
Jr/z 


Isolating the term n = 0 we obtain the inequality 

lEa:eAlMa:.x=o - £(0)| ^ ^ \x{n)\\E,,e{nMx ■ x)\. 

nez\{0} 
Now by iH), Fourier inversion, the injectivity and self-adjointness of M, and the hypothesis
that $|K|$ is odd, we have

\¥,xe{nMx ■ x)\ ^ \¥.h^xe{nM{x + h) ■ {x + h) — nMx ■ x)\^^'^ 

< {¥.h\^^e{2nMx-h)\f/‘^ 


— (lEhlM2nh=o)^'^^ — {^hXnh=oY^'^ 



1/2 




V^’ 


the last estimate following from (ITT3D . It follows that 


|lExeAlMa;.x=0 - 2(0)1 ^ I2(’^) I 1^1 (H-O) 

ngZ\{0} 

Now by integrating by parts three times and using the bound on ||x'"||oo one sees that 
|2 (?t,)| ^ In combination with the trivial bound ||2||oo ^ 2/f, we obtain 


riT^O |n|<£ 


Thus, since \K\^ lOOf^, (lll.bj) implies that 


E, 


xGK^Mx-x=0 ^ 


1 1 
M ^ IkI' 


which immediately implies the result. 


□ 
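The first reduction in the proof above relies on the standard fact that a quadratic form in three variables over $\mathbb{F}_p$ always has a non-trivial zero. The following is a minimal brute-force sketch of that fact, again assuming the form $Mx \cdot x$ is modelled by a symmetric integer matrix `A` over $\mathbb{F}_p$; the function names are hypothetical and the search is only feasible for small $p$.

```python
from itertools import product


def quadratic_value(A, x, p):
    """The quadratic form Mx . x as an integer mod p."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n)) % p


def isotropic_vectors(A, p):
    """All non-zero x in F_p^n with Mx . x = 0, found by exhaustive search."""
    n = len(A)
    return [x for x in product(range(p), repeat=n)
            if any(x) and quadratic_value(A, x, p) == 0]


if __name__ == "__main__":
    # Three variables over F_5: non-trivial zeros exist, e.g. (1, 2, 0).
    print(len(isotropic_vectors([[1, 0, 0], [0, 1, 0], [0, 0, 1]], 5)))
    # Two variables over F_7: x^2 + y^2 has no non-trivial zero.
    print(isotropic_vectors([[1, 0], [0, 1]], 7))
```

The two-variable example shows why groups with at most two cyclic factors per prime need the separate Fourier-analytic treatment: since $-1$ is not a square mod 7, the form $x^2 + y^2$ has only the trivial zero there.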


Corollary 11.5. Let H be a finite additive group, (|Fr|, 6 ) = 1, such that every element 
has order at most t, t ^ 2, and let M : H ^ H be a self-adjoint homomorphism. Then 
there exists a subgroup K of H such that 

\K\ ^ (11.7) 

and Mx ■ y = 0 for all x,y & K. 


Proof. Let $K$ be a subgroup of $H$ on which the quadratic form $Mx \cdot y$ vanishes (i.e.
$Mx \cdot y = 0$ for all $x, y \in K$), and which is maximal with respect to set inclusion. Observe
that the orthogonal complement $K^\perp$ of $K$ contains $K$, and hence by Lemma 11.3 the
quotient group $K^\perp/K$ has cardinality at least $|H|/|K|^2$. Also, every element in this
group has order at most $t$. Since $Mx \cdot y = My \cdot x = 0$ whenever $x \in K$ and $y \in K^\perp$, we
see that the bilinear form $Mx \cdot y$ descends to a bilinear form on $K^\perp/K$. If the associated
quadratic form vanished for at least one non-zero element of $K^\perp/K$, then by adjoining
this element to $K$ we could contradict the maximality of $K$. Thus we may assume that
there is no such form. But then by Lemma fl 1.41 we have \K^/K\ ^ lOOf^. Combining 
this with our lower bound for \K^/K\ we obtain the result. □ 


Let us return now to the sitnation (nn, and let K be the subgroup obtained by the 
above Corollary. By Lemma fl.ll we have 

\¥.y(.H'^x&y+KfH{x)e{-Mx ■ x)e{-f ■x)\'^ 8|a' -a\P 5^/2. 

Setting a"{ii) := the triangle inequality implies that either 

'Ky^nW{y) — a'l ^ 2\a' — a\ 5^ 112 


( 11 . 8 ) 
or

\E^ey+KfH{x)e{-Mx ■ x)e{-^ - a;)! - S\a"{y) -a’\) ^ 2\a’ - a| + jL (11.9) 
Suppose that holds. From Lemma mu we have 

Ey^H{(y”{y) - a) = 0 

and hence 

2EyizH max{a''{y) — a', 0) = Ey\a''{y) — a \ 2\a — a\ + 5^/12. 
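The identity just used, which converts a mean-zero $L^1$ bound into a one-sided bound, can be spelled out as follows. This is a short worked derivation in which $g$ is a hypothetical name for any real-valued function on $H$ with $\mathbb{E}_{y \in H}\, g(y) = 0$, and $c > 0$ is a hypothetical name for the lower bound.

```latex
% Assumption: g is real-valued on H with mean zero; c > 0 is the lower bound.
\[
  |g(y)| = 2\max(g(y), 0) - g(y)
  \quad\Longrightarrow\quad
  \mathbb{E}_{y \in H} |g(y)|
  = 2\,\mathbb{E}_{y \in H} \max(g(y), 0) - \mathbb{E}_{y \in H}\, g(y)
  = 2\,\mathbb{E}_{y \in H} \max(g(y), 0).
\]
% Pigeonhole: if E|g| >= c, then some y has max(g(y), 0) >= c/2,
% and since c > 0 this forces g(y) >= c/2.
\[
  \mathbb{E}_{y \in H} |g(y)| \geq c
  \quad\Longrightarrow\quad
  \exists\, y \in H : \; g(y) \geq c/2 .
\]
```

The same two-step argument is what produces a coset with increased density each time the mean-zero identity from Lemma 4.1 is combined with an $L^1$ lower bound.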

By the pigeonhole principle we thus conclude that there exists y E H such that 

2{a\y) - a) ^ 2 |a - a| + 5^/12, 
and hence by the triangle inequality 

E^(,y+K^A{x) = a"{y) ^ a'+\a'-a\+5^/24: ^ a+5'^/24 = E^^g^a{x)+5^/2A. (11.10) 
This, together with the lower bound 

\K\ ^ 

(cf. flll.7j) i implies the proposition under the assumption that (jll.Hjl holds. 

Suppose, then, that (HinD holds instead. By the pigeonhole principle, we can hnd y E H 
such that 

\Ex(zy+KfH{x)e{—Mx ■ x)e{—^ ' x)\ ^ 3\a"{y) — a\ + 2\a — «! + (5‘"/4. 

Splitting fn as (1 a — + {<^"{y) ~ cn') and using the triangle inequality, we conclude 

|Ej, 6 j^+x((lA(a:) - a"{y))e{-Mx ■ x)e{-i ■ x)\ ^ 2|a"(i/) -a\+ <5*^/4. 

We write x = y + z and use the fact that the bilinear form My ■ z is symmetric and 
vanishes on K to conclude that 

|EzeA((lA(l/ + z) - a”{y))e{-{2My + 0'2^)1 ^ 2|a"(i/) - a] + 5^/A. 

Consider now the homomorphism 0 : iC —>■ M/Z dehned by 0(a;) := {2My + ^) ■ x. 
A simple pigeonhole argument shows that there exist at least elements x 

of K for which ||0 (x)||r/z < But since every element of K has order at most 

gViog V, conclude that (j){x) = 0. Thus if we set K' to be the kernel of 0, then K' is 
a subgroup of K with 

\K'\^\K\e-^^^ > ( 11 . 11 ) 

We then apply Lemma 14.11 again to conclude that 

\E^eAew+K'ii^Aiy + z)- a''{y))e{-{2My + 0-^)1 ^ ‘^W'iy) - «! + 

Since the phase $e(-(2My + \xi) \cdot z)$ is constant for $z \in w + K'$, for fixed $w$, we conclude
Eu,eA|E^e^+x'(lA(l/ + z) - a”{y)) \ ^ 2\a”{y) - a] + 6^/4 
and thus, writing a"'{t) := 

E^gx|a'"(w + y) - a''{y)\ ^ 2\a''{y) - a| + 5^/4. 

Now from Lemma mu we have 

+ y) - a"{y)) = 0. 


( 11 . 12 ) 
Together with (11.12) this implies that

2E,w(zk TLn.ax{a"' {w + y) — C(''{y), 0) ^ 2\a''{y) — a\+ (5‘"/4, 
and so there exists w & K such that 

a'"{w + y) — a"{y) ^ \ci"{y) — a\-\r (5‘"/8. 

One final application of the triangle inequality gives at last that

¥.^(.w+y+K'lA{x) = a"{w + y)^ a + (5*^/8 = Ea:6GlA(x) + 6^/S. 

Together with the lower bound (11.11), this concludes the proof of Proposition 11.2. □

Proof of Theorem. Let $G$ be an abelian group, and let $A \subseteq G$ be a set with
cardinality at least $\delta N$ which contains no 4 distinct elements in arithmetic progression.
We wish to show that $\delta \ll (\log\log N)^{-c}$ for some absolute constant $c$; thus we may
certainly suppose that $\delta \geq 2/\log\log N$.

Suppose that G has an element of order greater than exp ((log Writing TT = (^f), 

we see that there is some coset y + H such that ^ 6. The result of Gowers 

pH] then immediately implies that 

6 < (loglog \H\y < (loglog 

Suppose, then, that all elements of G have order at most exp((logiV)^/^). We will dehne 
a sequence G = Go ^ Gi ^ . of subgroups of G with cardinalities N = No ^ Ni ^ 

.... The sequence will be dehned in such a way that 

A(,-^ exp((logiV)2/3), (11.13) 

which means that no element in Gj has order greater than exp((logiV,)^/^), and also 
that 6 ^ l/loglogiVj. This is to enable us to apply Proposition II 1.21 

Suppose that we have defined $G_j$. For any $x_j$, the set $(x_j + A) \cap G_j$ does not contain a
4-term arithmetic progression. Suppose that $x_j$ is such that $\mathbb{E}_{x \in x_j + G_j} 1_A(x) \geq \delta$. Then,

applying Proposition I11.2L we see that there is some G^+i, := |Gj_|_i| ^ i 

together with some Xj+i so that 

^xdXj+i+Gj+i^Aix) ^ ^x(iXj+Gj^A{x) T . 

Iterating this construction leads to a contradiction for some j ^ unless flll.l8|l is 
violated. Since Nj ^ , we must therefore have 

7V(V8)""'' < exp((log iV)2/3), 

which implies the required bound $\delta \ll (\log\log N)^{-c}$. □
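For very small cyclic groups the quantity $r_4(G)$ can be found by exhaustive search, which gives a concrete feel for the definition (it says nothing about the asymptotic bound just proved). The following Python sketch is an illustration only; the names `has_4ap` and `r4_cyclic` are ours and do not appear in the paper, and the search is exponential in $N$.

```python
from itertools import combinations


def has_4ap(subset, n):
    """True if `subset` of Z/nZ contains 4 distinct elements x, x+d, x+2d, x+3d (mod n)."""
    s = set(subset)
    for x in s:
        for d in range(1, n):
            prog = {(x + j * d) % n for j in range(4)}
            # Require four distinct points, all lying in the set.
            if len(prog) == 4 and prog <= s:
                return True
    return False


def r4_cyclic(n):
    """Size of the largest subset of Z/nZ with no 4 distinct elements in progression."""
    for size in range(n, 0, -1):
        if any(not has_4ap(c, n) for c in combinations(range(n), size)):
            return size
    return 0


if __name__ == "__main__":
    for n in range(5, 12):
        print(n, r4_cyclic(n))
```

In $\mathbb{Z}/5\mathbb{Z}$ every 4-element subset is itself a progression (the complement of a point $m$ is $m+1, m+2, m+3, m+4$), so the search returns 3 there.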

Remarks. We hope to prove a bound of the form r 4 (G) -C iV(logA^)“'^ in a future 
paper by combining the ideas of issi with nested Bohr set technology in the spirit of 
that used in ^ and m of the present paper. These methods were hrst introduced by 
Bourgain nm, who obtained the bound r 3 (G) <C iV(logA^) 1 / 2 +*^, which is still the best 
currently known when G = Z/iVZ. A feature of this approach is that, unlike in the 
present section, it is no easier to deal with Z/A^Z than it is with an arbitrary abelian 
G. We note that the celebrated Erdos-Turan conjecture [ini is roughly equivalent to a 
bound of the form rfc(Z/iVZ) <Ca: iV(logiV) and so even in the case k = 3 there is 
an awful lot left to be done. 

12. An ergodic theory interpretation 

We now connect the inverse theorems discussed earlier to ergodic theory, and in 
particular to the recent work of Host-Kra mi and Ziegler |S|. 

Define a measure-preserving system $(X, \mathcal{B}, T, \mathbb{P})$ to be a probability space $(X, \mathcal{B}, \mathbb{P})$ with
an invertible measure-preserving (i.e. probability-preserving) shift operator $T : X \to X$.
This induces a shift operator $T$ on random variables $f : X \to \mathbb{R}$ by the formula
$Tf(x) := f(T^{-1}x)$, and more generally $T^n f(x) := f(T^{-n}x)$ for any $n \in \mathbb{Z}$. We use
$\mathbb{E}_X(f)$ to denote the expectation of $f$.

The Furstenberg correspondence principle (see e.g. lEI) equates combinatorial theorems
such as Szemeredi’s theorem to recurrence results in ergodic theory. In particular,
Theorem 1.1 is logically equivalent (using the axiom of choice) to the following theorem.

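Theorem 12.1 (Furstenberg recurrence theorem j1 Hll^ h). Let $(X, \mathcal{B}, T, \mathbb{P})$ be a measure-preserving
system, and let $f \in L^\infty(X, \mathcal{B})$ be any non-negative random variable with
$\mathbb{E}_X f > 0$. Then for every $k \geq 1$ we have

$$\liminf_{N \to \infty} \mathbb{E}_{-N \leq n \leq N} \mathbb{E}_X(f \, T^n f \cdots T^{(k-1)n} f) > 0.$$

In particular, if $A \in \mathcal{B}$ is any event with positive probability $\mathbb{P}(A) > 0$, then

$$\liminf_{N \to \infty} \mathbb{E}_{-N \leq n \leq N} \mathbb{P}(A \cap T^n A \cap \cdots \cap T^{(k-1)n} A) > 0.$$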

Recently, it was shown in HU and [HH] that this limit inferior can in fact be replaced 
by a limit; earlier work related to the k = A case can be found in mi El EHi mi. 
The two approaches are slightly different; the argument in m proceeds by establishing 
the ergodic theory analogue of an inverse theorem for the Gowers uniformity norm 
$U^d(\mathbb{Z}/N\mathbb{Z})$. Indeed, if $f \in L^\infty(X, \mathcal{B})$ is any complex-valued random variable, define the
quantity $\|f\|_{U^d(T)}$ for $d \geq 0$ by the formula

we{o,i}‘* 

It can be shown muni that this limit actually exists, and it can also be shown that 
||/||t/d('r) is in fact a semi-norm on bounded random variables for any d ^ 1; see [41 j . 
This semi-norm is clearly related to the U^{G) norms dehned in Dehnition II .til For 
instance, if $T$ is periodic of order $N$ then it is easy to see that the $U^d(T)$ norm of $f(x)$
is the average of the $U^d(\mathbb{Z}/N\mathbb{Z})$ norm of $f$ restricted to the orbits of $T$:

$$\|f\|_{U^d(T)} = \mathbb{E}_x \|(T^h f(x))_{h \in \mathbb{Z}/N\mathbb{Z}}\|_{U^d(\mathbb{Z}/N\mathbb{Z})}.$$

Here of course we take advantage of the periodicity of $T$ to define $T^h$ for $h \in \mathbb{Z}/N\mathbb{Z}$ in
the obvious manner.

In mi it was observed that this semi-norm controls expressions such as those appearing 
in the Furstenberg recurrence theorem. Indeed, there is an analogue of Proposition 1.7
which asserts that if $f_0, \ldots, f_{k-1}$ are bounded random variables and at least one of them
has vanishing $U^{k-1}(T)$ norm, then

$$\lim_{N \to \infty} \mathbb{E}_{-N \leq n \leq N} \mathbb{E}_X(f_0 \, T^n f_1 \cdots T^{(k-1)n} f_{k-1}) = 0;$$

in fact one can make the slightly stronger claim that $\mathbb{E}_{-N \leq n \leq N} T^n f_1 \cdots T^{(k-1)n} f_{k-1}$
converges to zero in (for instance) the $L^2(X)$ sense. Informally, this fact shows that
functions with vanishing $U^{k-1}(T)$ norm are irrelevant for understanding $k$-fold recurrence.

It is thus of interest to determine when the $U^{k-1}(T)$ norm is positive. This question
is answered in jH] using the language of nilsystems. We first recall some notation. If
$G$ is a (not necessarily abelian) group written multiplicatively and if $g, h \in G$, we let
$[g, h] := g^{-1}h^{-1}gh$ denote the commutator of $g$ and $h$. If $G'$ and $G''$ are subgroups of
$G$, we let $[G', G''] = [G'', G']$ be the subgroup generated by the commutators
$\{[g', g''] : g' \in G', g'' \in G''\}$. We then define the lower central series

$$G = G_1 \supseteq G_2 \supseteq G_3 \supseteq \cdots$$

of subgroups of $G$ by the recursive definition $G_1 := G$; $G_{i+1} := [G, G_i]$. We say that $G$
is $(k-2)$-step nilpotent for some $k \geq 2$ if $G_{k-1}$ is trivial. Thus for instance a group is
1-step nilpotent if and only if it is abelian.
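To make the definition concrete, the commutator $[g, h] := g^{-1}h^{-1}gh$ can be computed by hand for $3 \times 3$ upper unitriangular integer matrices. The Python sketch below is only an illustration under our own naming (`heis`, `commutator` and so on are not from the paper); it checks on a few sample elements that commutators are non-trivial but central, which is what 2-step nilpotency predicts for this group.

```python
def heis(x, y, z):
    """Upper unitriangular matrix [[1, z, y], [0, 1, x], [0, 0, 1]]."""
    return [[1, z, y], [0, 1, x], [0, 0, 1]]


IDENTITY = heis(0, 0, 0)


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def inverse(g):
    """Inverse of an upper unitriangular 3x3 matrix."""
    x, y, z = g[1][2], g[0][2], g[0][1]
    return heis(-x, x * z - y, -z)


def commutator(g, h):
    """[g, h] := g^{-1} h^{-1} g h."""
    return matmul(matmul(inverse(g), inverse(h)), matmul(g, h))


if __name__ == "__main__":
    g, h, k = heis(1, 0, 0), heis(0, 0, 1), heis(2, 5, -3)
    c = commutator(g, h)
    print("[g, h] =", c)                      # a non-trivial element of G_2
    print("[k, [g, h]] =", commutator(k, c))  # trivial: G_3 is the identity
```

A check on sample elements is of course not a proof; the general statement follows from the fact that products of two strictly upper triangular $3 \times 3$ matrices have only a top-right entry.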

A $(k-2)$-step nilmanifold is defined to be a manifold of the form $G/\Gamma := \{x\Gamma : x \in G\}$,
where $G$ is a finite-dimensional nilpotent Lie group, and $\Gamma$ is a discrete subgroup of $G$
which is co-compact (i.e. the nilmanifold $G/\Gamma$ is compact). Note that we do not assume
$\Gamma$ to be normal, and hence a nilmanifold need not have a group structure. It is however
a compact symmetric space, with a left-action of the group $G$. Thus there is a unique
invariant Haar measure $\mathbb{P}$ on a nilmanifold, which we normalize to be a probability
measure. Thus every nilmanifold is a probability space, taking the $\sigma$-algebra to be the
Borel $\sigma$-algebra.

If $g \in G$ then we write $T_g$ for the shift operator from $G$ to itself defined by $T_g(x) = gx$,
and also (by abuse of notation) for the map from $G/\Gamma$ to itself defined by $T_g(x\Gamma) := gx\Gamma$.
This latter map is measure-preserving and invertible. Let us call a $(k-2)$-step
nilmanifold with one of these shift operators a $(k-2)$-step nilflow. A $(k-2)$-step
nilfunction is defined to be any continuous function $F : G/\Gamma \to \mathbb{C}$ on a $(k-2)$-step
nilmanifold; given such a nilfunction, a point $x_0 \in G/\Gamma$ and a group element $g \in G$, we
define the associated basic $(k-2)$-step nilsequence $F_{g,x_0} : \mathbb{Z} \to \mathbb{C}$ by the formula^®

$$F_{g,x_0}(n) := F(T_g^n x_0) \quad \text{for all } n \in \mathbb{Z}.$$

We can truncate this to $\mathbb{Z}/N\mathbb{Z}$ and define the truncated nilsequence $F_{N,g,x_0} : \mathbb{Z}/N\mathbb{Z} \to \mathbb{C}$
by the formula

$$F_{N,g,x_0}(n) := F_{g,x_0}(n) = F(T_g^n x_0) \quad \text{for all } -N/2 < n \leq N/2,$$

where we identify the integers from $-N/2$ to $N/2$ with $\mathbb{Z}/N\mathbb{Z}$ in the usual manner.

We now give three key examples of nilflows and nilsequences. 

^^A general $(k-2)$-step nilsequence is defined as the uniform limit of basic $(k-2)$-step sequences;
see [3] for further analysis of these nilsequences.





Example 12.2 (The circle nilflow). Let $G$ be the one-dimensional matrix group

$$G := \left\{ \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} : x \in \mathbb{R} \right\}$$

and let $\Gamma$ be the discrete subgroup

$$\Gamma := \left\{ \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix} : n \in \mathbb{Z} \right\}.$$

Then $G/\Gamma$ is a 1-step nilmanifold (and hence also a 2-step nilmanifold); indeed we
can easily identify it with the unit circle $\mathbb{R}/\mathbb{Z}$. A shift $T_g$ on this nilmanifold then
corresponds to a simple translation $x \mapsto x + \alpha$, where $\alpha \in \mathbb{R}$ is the upper right matrix
entry of $g$. In particular, if $F : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ is any function, we see that
the sequence

$$n \mapsto F(T_g^n x) = F(x + n\alpha)$$

is a basic 1-step nilsequence. Thus, for instance, the linear phase function $n \mapsto e(n\alpha)$
is a basic 1-step nilsequence (and hence also a 2-step nilsequence). More generally, any
quasiperiodic sequence is a basic 1-step nilsequence, and any almost periodic sequence
can be expressed as the uniform limit of basic 1-step nilsequences.


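Example 12.3 (The skew shift nilflow). Now we consider the example

$$G := \begin{pmatrix} 1 & \mathbb{Z} & \mathbb{R} \\ 0 & 1 & \mathbb{R} \\ 0 & 0 & 1 \end{pmatrix}; \qquad \Gamma := \begin{pmatrix} 1 & \mathbb{Z} & \mathbb{Z} \\ 0 & 1 & \mathbb{Z} \\ 0 & 0 & 1 \end{pmatrix}.$$

Then $G/\Gamma$ is a 2-step nilmanifold, and one can identify it topologically with the 2-torus
$(\mathbb{R}/\mathbb{Z})^2$ by the identification

$$(x, y) = \begin{pmatrix} 1 & 0 & y \\ 0 & 1 & x \\ 0 & 0 & 1 \end{pmatrix} \Gamma.$$

If we let

$$g := \begin{pmatrix} 1 & m & \beta \\ 0 & 1 & \alpha \\ 0 & 0 & 1 \end{pmatrix}$$

be a typical element of $G$ (thus $m \in \mathbb{Z}$ and $\alpha, \beta \in \mathbb{R}$) then the shift $T_g$ is given
by $(x, y) \mapsto (x + \alpha, y + \beta + mx)$, and thus if $F : (\mathbb{R}/\mathbb{Z})^2 \to \mathbb{C}$ is any function then the
sequence

$$n \mapsto F(T_g^n(x, y)) = F(x + n\alpha, \; y + n\beta + \tfrac{1}{2}mn(n+1)\alpha)$$

is a basic 2-step nilsequence. Thus, for instance, the quadratic phase function
$n \mapsto e(\tfrac{1}{2}\alpha n(n+1))$ is a basic 2-step nilsequence. More generally, any quadratic phase
$n \mapsto e(\alpha n^2 + \beta n + \gamma)$, or finite linear combination of such phases, is a basic 2-step nilsequence.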
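The reason the skew shift produces quadratic phases is that each step adds a multiple of the current $x$-coordinate, which itself grows linearly, to the $y$-coordinate. The Python sketch below illustrates this with exact rational arithmetic, assuming the map $(x, y) \mapsto (x + \alpha, y + \beta + mx)$ lifted to $\mathbb{R}^2$ (that is, without reducing mod 1); the function names and the sample parameters are our own choices, not the paper's.

```python
from fractions import Fraction


def lifted_orbit(x, y, m, alpha, beta, steps):
    """Iterate (x, y) -> (x + alpha, y + beta + m*x) on R^2, without reducing mod 1."""
    orbit = [(x, y)]
    for _ in range(steps):
        # The right-hand side is evaluated with the old x before assignment.
        x, y = x + alpha, y + beta + m * x
        orbit.append((x, y))
    return orbit


def second_differences(values):
    """Discrete second derivative v[i+2] - 2 v[i+1] + v[i]."""
    return [values[i + 2] - 2 * values[i + 1] + values[i]
            for i in range(len(values) - 2)]


if __name__ == "__main__":
    m, alpha, beta = 3, Fraction(2, 7), Fraction(1, 5)
    orbit = lifted_orbit(Fraction(1, 3), Fraction(0), m, alpha, beta, 10)
    ys = [y for _, y in orbit]
    # Constant second differences mean y_n is a quadratic polynomial in n
    # with leading coefficient m * alpha / 2.
    print(second_differences(ys))
```

Since the second differences of the $y$-coordinate are constantly $m\alpha$, the sequence $n \mapsto e(y_n)$ is a quadratic phase; the linear and constant coefficients depend on the starting point and on $\beta$.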

Example 12.4 (The Heisenberg nilflow). Now we consider the example

$$G := \begin{pmatrix} 1 & \mathbb{R} & \mathbb{R} \\ 0 & 1 & \mathbb{R} \\ 0 & 0 & 1 \end{pmatrix}; \qquad \Gamma := \begin{pmatrix} 1 & \mathbb{Z} & \mathbb{Z} \\ 0 & 1 & \mathbb{Z} \\ 0 & 0 & 1 \end{pmatrix}.$$

Then $G/\Gamma$ is a 2-step nilmanifold. By using the identification

$$(x, y, z) = \begin{pmatrix} 1 & z & y \\ 0 & 1 & x \\ 0 & 0 & 1 \end{pmatrix} \Gamma,$$

we can identify $G/\Gamma$ (as a set) with $\mathbb{R}^3$ quotiented out by the equivalence relations

$$(x, y, z) \sim (x + a, \; y + b + az, \; z + c) \quad \text{for all } a, b, c \in \mathbb{Z}.$$

This can in turn be coordinatized by the cylinder $[-1/2, 1/2] \times (\mathbb{R}/\mathbb{Z})^2$ with the
identification $(-1/2, y, z) \sim (1/2, y + z, z)$.

Let $F : G/\Gamma \to \mathbb{C}$ be a function. We may lift this to a function $\tilde F : G \to \mathbb{C}$, defined by
$\tilde F(g) := F(g\Gamma)$. In coordinates, this lift takes the form

$$\tilde F(x, y, z) = F(\{x\}, \; y - [x]z \ (\mathrm{mod}\ 1), \; z \ (\mathrm{mod}\ 1))$$

where $[x] = x - \{x\}$ is the nearest integer to $x$ (we round half-integers up). If we let

$$g := \begin{pmatrix} 1 & \gamma & \beta \\ 0 & 1 & \alpha \\ 0 & 0 & 1 \end{pmatrix}$$

be an element of $G$, then the shift $T_g : G \to G$ is given by

$$T_g(x, y, z) = (x + \alpha, \; y + \beta + \gamma x, \; z + \gamma),$$

from which a short induction confirms that

$$T_g^n(x, y, z) = (x + n\alpha, \; y + n\beta + \tfrac{1}{2}n(n+1)\alpha\gamma, \; z + n\gamma).$$

Therefore if $F : G/\Gamma \to \mathbb{C}$ is any function, written as a function
$F : [-\tfrac12, \tfrac12] \times (\mathbb{R}/\mathbb{Z})^2 \to \mathbb{C}$ with $F(-1/2, y, z) = F(1/2, y + z, z)$, then we have

$$F(T_g^n(x, y, z)) = F(\{x + n\alpha\}, \; y + n\beta + \tfrac{1}{2}n(n+1)\alpha\gamma - [x + n\alpha](z + n\gamma), \; z + n\gamma).$$

This, of course, is a basic 2-step nilsequence. We see, for instance, that the generalized
quadratic phase function $n \mapsto e(\tfrac{1}{2}n(n+1)\alpha\gamma - [n\alpha]n\gamma)$ is a basic 2-step nilsequence.
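The coordinate description of the lift can be checked mechanically: sending $(x, y, z)$ to $(\{x\}, y - [x]z \bmod 1, z \bmod 1)$ should give the same answer on every representative of a class of the relation $(x, y, z) \sim (x + a, y + b + az, z + c)$. The Python sketch below does this with exact rationals; `nearest_int` and `reduce_point` are hypothetical helper names of ours, and the rounding convention (half-integers up) follows the text.

```python
from fractions import Fraction
from math import floor


def nearest_int(x):
    """Nearest integer [x] to x, rounding half-integers up."""
    return floor(x + Fraction(1, 2))


def reduce_point(x, y, z):
    """Representative ({x}, y - [x] z mod 1, z mod 1), with {x} in [-1/2, 1/2)."""
    k = nearest_int(x)
    return (x - k, (y - k * z) % 1, z % 1)


if __name__ == "__main__":
    x, y, z = Fraction(7, 3), Fraction(1, 4), Fraction(5, 6)
    base = reduce_point(x, y, z)
    for a, b, c in [(1, 0, 0), (0, 1, 0), (0, 0, 1), (-2, 5, 3)]:
        # Each equivalent representative reduces to the same point.
        print(reduce_point(x + a, y + b + a * z, z + c) == base)
```

The same function also reproduces the gluing of the two ends of the cylinder: `reduce_point(1/2, y + z, z)` and `reduce_point(-1/2, y, z)` agree, matching the identification $(-1/2, y, z) \sim (1/2, y + z, z)$.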

We call the three basic examples just discussed the fundamental 2-step nilsequences.
They may be used in a straightforward product construction to construct further
nilsequences, as we now describe.

If $(G/\Gamma, T_g)$ and $(G'/\Gamma', T_{g'})$ are 2-step nilflows, then so is the direct sum
$((G \oplus G')/(\Gamma \oplus \Gamma'), T_{(g,g')})$. Also, if $F : G/\Gamma \to \mathbb{C}$ and $F' : G'/\Gamma' \to \mathbb{C}$ are functions, and we define the
tensor product $F \otimes F' : (G \oplus G')/(\Gamma \oplus \Gamma') \to \mathbb{C}$ in the usual manner as

$$F \otimes F'(x, x') := F(x)F'(x')$$

then we see that the function $F \otimes F'(T^n_{(g,g')}(x, x'))$ factors as

$$F \otimes F'(T^n_{(g,g')}(x, x')) = F(T_g^n x)F'(T_{g'}^n x').$$

Now suppose we take $n_1$ nilsequences coming from circle nilflows, $n_2$ nilsequences coming
from skew shift nilflows, and $n_3$ nilsequences coming from Heisenberg nilflows, and tensor
them all together. What results is a nilsequence on a 2-step nilmanifold $G/\Gamma$, which
is topologically the cube $[-\tfrac12, \tfrac12]^{n_1 + 2n_2 + 3n_3}$ with faces identified. 2-step nilsequences of
this type, that is to say tensor products of fundamental nilsequences, are in a sense
the only important ones if one is interested in the Gowers $U^3$ norm. We call them
the elementary 2-step nilsequences (we also refer to elementary 2-step nilmanifolds and
elementary 2-step nilflows).

Given an elementary 2-step nilsequence, it is natural to refer to $n_1 + 2n_2 + 3n_3$ as
its dimension. It is also of interest to have a notion of how continuous the underlying
function $F : G/\Gamma \to \mathbb{C}$ is. We adopt a rather low-brow approach to this concept which
is sufficient for our purposes. Let $\delta = 1/m$ be the reciprocal of an integer. Then we may
divide $(-\tfrac12, \tfrac12]^d$ into cubes of sidelength $\delta$, for any $d$, and in fact these subdivisions
respect the quotienting of Examples 12.2, 12.3 and 12.4 (for $d = 1, 2, 3$ respectively),
giving what we refer to as $\delta$-nets on the three fundamental 2-step nilmanifolds $G/\Gamma$.
We refer to the building blocks of these nets as the $\delta$-atoms. Taking products, we
may obtain $\delta$-atoms and a $\delta$-net on any elementary 2-step nilmanifold $G/\Gamma$. Finally, if
$F : G/\Gamma \to \mathbb{C}$ is a function and if $K > 0$ is a constant, we say that $F$ is $K$-Lipschitz if
for all $\delta = 1/m$ and for all $\delta$-atoms $A$ we have

$$|F(x) - F(x')| \leq K\delta$$

whenever $x, x' \in A$. The following lemma is straightforward.

Lemma 12.5. Suppose that $F_i : G_i/\Gamma_i \to \mathbb{D}$, $i = 1, \ldots, k$, are $K$-Lipschitz functions
on elementary 2-step nilmanifolds $G_i/\Gamma_i$. Then the tensor product $F_1 \otimes \cdots \otimes F_k$ is
$Kk$-Lipschitz. □

Now it has been known since the work of Furstenberg and Weiss [21] that random
variables on a $(k-2)$-step nilflow $(G/\Gamma, T_g)$ have a non-trivial behavior with respect to
$k$-term recurrence. Indeed, given any bounded random variable $f_0 : G/\Gamma \to \mathbb{C}$ which is
not identically zero, one can find $f_1, \ldots, f_{k-1}$ such that the averages

$$\mathbb{E}_{-N \leq n \leq N} \mathbb{E}_X(f_0 \, T_g^n f_1 \cdots T_g^{(k-1)n} f_{k-1})$$

do not converge to zero; this is basically due to non-trivial algebraic relations between
the $k$ points $x\Gamma, g^n x\Gamma, \ldots, g^{(k-1)n} x\Gamma$. In particular, the $U^{k-1}(T_g)$ norm is non-degenerate
on this nilmanifold (and is thus a genuine norm); see jUUHS] for some further discussion
of this fact. We will prove a variant of this statement.

Proposition 12.6 (Nilsequences obstruct uniformity). Let $k \geq 3$, and let $(G/\Gamma, T)$ be
a $(k-2)$-step nilsystem. Let $F : G/\Gamma \to \mathbb{C}$ be a continuous function on $G/\Gamma$ which is
not identically zero. Suppose that $N > k - 1$ is a prime, and that $f : \mathbb{Z}/N\mathbb{Z} \to \mathbb{D}$ is a
function such that

$$|\mathbb{E}_{-N/2 \leq n \leq N/2} f(n)F(T_g^n x)| \geq \eta.$$

Then we have

$$\|f\|_{U^{k-1}} \geq c_{F,G/\Gamma}(\eta) > 0$$

uniformly in $x \in G/\Gamma$, $g \in G$ and $N$.

Proof. In proving this proposition we will use the following lemma to the effect that the
point $T_g^{n+(k-1)r}x\Gamma$ is completely constrained by the cosets $T_g^n x\Gamma, T_g^{n+r}x\Gamma, \ldots, T_g^{n+(k-2)r}x\Gamma$.

Lemma 12.7. Let $(G/\Gamma, T)$ be a $(k-2)$-step nilsystem. Then there is a compact set
$\Sigma \subseteq (G/\Gamma)^{k-1}$ and a continuous function $P : \Sigma \to G/\Gamma$ such that for all $n, r \in \mathbb{Z}$, $g \in G$
and $x \in G/\Gamma$ we have

$$(T_g^n x, T_g^{n+r}x, \ldots, T_g^{n+(k-2)r}x) \in \Sigma$$

and

$$P(T_g^n x, T_g^{n+r}x, \ldots, T_g^{n+(k-2)r}x) = T_g^{n+(k-1)r}x.$$

The existence of a constraint of the type is discussed in several places in the ergodic 
theory literature 0 na EH El- For the convenience of the reader we supply a self- 
contained proof in Appendix fT^ 

The proof of Proposition 11 2. til is not dissimilar to that of Theorem 12.71 (ii), which was 
given at the start of ^ but here we use Tjemma Tl 2. 71 in place of ra, and the technical 
details are rather different. 

Assume without loss of generality that $\|F\|_\infty \leq 1$, and let $\varepsilon \leq 1/100k$ be a small
constant (it will be chosen to be a small multiple of $\eta$). Observe that the function
$F \circ P : \Sigma \to \mathbb{C}$ is continuous, hence uniformly continuous, on the compact set $\Sigma$. In
particular we can find a neighbourhood $V \subseteq G$ of the identity 1 which depends on $\varepsilon$,
$F$, $G/\Gamma$ such that

$$F(P(x_0, \ldots, x_{k-2})) = F(P(y_0, \ldots, y_{k-2})) + O(\varepsilon) \qquad (12.1)$$

whenever $(x_0, \ldots, x_{k-2}), (y_0, \ldots, y_{k-2}) \in \Sigma$ and $x_j \in Vy_j$ for all $0 \leq j \leq k - 2$. Applying
Lemma 12.7 and exploiting compactness again, we conclude that there exists another
neighbourhood $V' \subseteq V$ of the identity 1 such that given any $z_1, \ldots, z_{k-2} \in G/\Gamma$, there
exists a bounded function $Q_{z_1,\ldots,z_{k-2}} : G/\Gamma \to \mathbb{D}$ such that

$$F(T_g^{n+(k-1)r}x) = Q_{z_1,\ldots,z_{k-2}}(T_g^n x) + O(\varepsilon) \quad \text{whenever } T_g^{n+jr}x \in V'z_j \text{ for all } 1 \leq j \leq k - 2.$$

It is not hard to ensure that the function $Q_{z_1,\ldots,z_{k-2}}$ depends in a measurable manner
on $z_1, \ldots, z_{k-2}$. In particular we see that

k-2 

+ {k- l)r) n 

J=1 

k-2 

= Q.. .+ (*; - 1 )--) n 

i=i 

k-2 

+0(£F(T;+(^-'>x)/(n +{k- l)r) J] 1 v,,^.(T;+^''x).(12.2) 

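for all $n, r \in \mathbb{Z}$, $g \in G$ and $x \in G/\Gamma$. We are going to average this over $n$ and
$r$, but for technical reasons related to the difference between $\{n : |n| \leq N/2\}$ and
$\mathbb{Z}/N\mathbb{Z}$ as additive objects, we shall restrict the range of $r$. To this end, take a function
$\chi : \mathbb{Z}/N\mathbb{Z} \to \mathbb{R}^+$ such that $\mathbb{E}_r \chi(r) = 1$, $\mathrm{Supp}(\chi) \subseteq [-\varepsilon N, \varepsilon N]$ and for which we have
the Fourier estimate

$$\sum_{m \in \mathbb{Z}/N\mathbb{Z}} |\hat\chi(m)| \leq 100/\varepsilon. \qquad (12.3)$$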







Such a function can easily be constructed, for example by convolving an interval with
itself. Averaging (12.2) over $|n| \leq N/2 - \varepsilon kN$ and over $r$ weighted by $\chi$, one obtains


k-2 


j=l 

k-2 

= ^n\^N/ 2 - 6 kNKez/NzX{r)h{n)f{n + {k - l)r) JJ + jr) 

i=i 

k-2 

+0(£ni'"-.Vr'V)), (12.4) 

i=i 


where

$$h(n) := Q_{z_1,\ldots,z_{k-2}}(T_g^n x) \quad \text{and} \quad h_j(n) := 1_{V'z_j}(T_g^n x). \qquad (12.5)$$

Note that as a consequence of the restrictions we have made on the support of $n$ and
of $r$, the expressions $n + jr$, $j = 0, 1, \ldots, k - 1$ are the same whether we regard the
addition as taking place in $\mathbb{Z}$ or in $\mathbb{Z}/N\mathbb{Z}$. In particular, (12.4) remains valid if one
imagines that these additions are made in $\mathbb{Z}/N\mathbb{Z}$.

Now the second line of (12.4) can be written as

$$\mathbb{E}_{|n| \leq N/2} \mathbb{E}_{r \in \mathbb{Z}/N\mathbb{Z}} \chi(r)\tilde h(n) f(n + (k-1)r) \prod_{j=1}^{k-2} h_j(n + jr),$$

where

$$\tilde h(n) := (1 - 2\varepsilon k)^{-1} h(n) 1_{|n| \leq N/2 - \varepsilon kN},$$

and in particular $\|\tilde h\|_\infty \leq 2$. Writing $\chi(r)$ in terms of its Fourier transform on $\mathbb{Z}/N\mathbb{Z}$ and
using Proposition 1.7, we can bound this expression above as follows, where $e_N(r) := e(r/N)$:

k-2 

E|n|^A/ 2 lErez/Azy(F)h(n)/(n +{k- l)r) hj{n + jr) 

i=i 




k-2 


x{m)eN{—mr)h{n)f{n + {k — l)r) hj{n + jr) 

j=l 


— |lE.„gz/ArzEr6Z/Arz E x{m)h{n)eN{'mn)hi{n + r)eM{—xn{n + r)) x 

m^'LlN'L 


k-2 

^ n + 2^) ■ /(^ + {k — l)r) 

j=2 


« 2 !£( 

m&'E/N'E 


m 


Uk-i 




200 
Now we substitute this into (12.4) and average over $z_1, \ldots, z_{k-2}$ (picking these $k - 2$
elements uniformly according to the Haar measure $\mathbb{P}$ on $G/\Gamma$). This yields


|lE.|n|^Ar/2-eA;AlE.r x(r)F(T;+(^-i>a;)/(n+(fc-l)r)| ^ 


200 


jjk-i 


+ 0{e). 

( 12 . 6 ) 


£P(7r(l/'))^“^ 

Write $G(n) := F(T_g^n x)f(n)$. Then $\|G\|_\infty \leq 1$, and so we have

^\n\-S^N/2—6kN^rX,i,'^')G{^^ T (fc 1)^) ^rX,i^^')^—N/2+£kN—(k—l)r^n'i^N/2—ekN—(k—l)rG(^n ) 

= Erx{r) (E|n|^A/ 2 G(n') + 0(e:)) 

= lE|„|^Ar/2G'(u) + 0(e). 

Comparing this with (12.6) gives

200 


FiH^A/2/(n)F(r;a:)| ^ 


^k-l 


■3“'^' " eP(7r(W))^-2 
which implies the result if $\varepsilon = c\eta$ for $c = c_k$ sufficiently small.


+ 0(e), 


□ 


Remark. An alternative way to obtain this lemma is to establish that the $(k-2)$-step
nilsequence $n \mapsto F(T_g^n x)$ can be approximated to high accuracy by a function which
is uniformly almost periodic of order $k - 2$ in the sense of inn; this approach has the
advantage of not requiring an explicit algebraic constraint such as that given in Lemma
12.7, but we do not pursue it here. This approach corresponds closely to the observation
that a $(k-2)$-step nilflow can be constructed as a tower of $k - 2$ compact extensions of
the trivial measure-preserving system, see 12111111112] for further discussion.

Proposition 12.6 shows (essentially) that the basic $(k-2)$-step nilsequences form
“obstructions to quadratic uniformity”, in the sense that functions $f : \mathbb{Z}/N\mathbb{Z} \to \mathbb{D}$ which
have a large inner product with such functions cannot have small $U^{k-1}(\mathbb{Z}/N\mathbb{Z})$ norm.
The remarkable result of Host and Kra m asserts, roughly speaking, that these are
in fact the only obstructions to having small $U^{k-1}$ norm. More precisely, they work in
the infinitary setting of arbitrary measure-preserving systems (as opposed to the shift
on $\mathbb{Z}/N\mathbb{Z}$) and show that this system contains as an invariant factor an inverse limit of
$(k-2)$-step nilflows, such that the $U^{k-1}(T)$ norm vanishes on the orthogonal complement
of this inverse limit. In particular, this inverse limit is a characteristic factor for
the $U^{k-1}(T)$ norm, and for all quantities controlled by this norm, including the $k$-term
recurrence expressions appearing for instance in the Furstenberg recurrence theorem;
this fact is crucial in establishing the convergence of these recurrence expressions. We
remark that the work of Ziegler unEni achieves a very similar result, but avoids use
of the $U^{k-1}(T)$ norm and obtains a characteristic factor (and convergence results) for
the recurrence expressions directly. Also, the subsequent work of Bergelson, Host, and
Kra [2] gives a further discussion of the connection between the $U^{k-1}(T)$ norm and
$(k-2)$-step nilsequences.
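The simplest instance of an obstruction can be computed directly on a small cyclic group. The Python sketch below evaluates Gowers norms on $\mathbb{Z}/N\mathbb{Z}$ through the standard recursion $\|f\|_{U^d}^{2^d} = \mathbb{E}_h \|f \, \overline{T^h f}\|_{U^{d-1}}^{2^{d-1}}$ with $\|f\|_{U^1}^2 = |\mathbb{E} f|^2$, and applies it to the quadratic phase $n \mapsto e(n^2/N)$. The function names and the choice $N = 7$ are ours, and the brute-force cost is $O(N^{d})$, so this is an illustration rather than a practical tool.

```python
import cmath


def e(x):
    """The character e(x) := exp(2 pi i x)."""
    return cmath.exp(2j * cmath.pi * x)


def gowers_power(f, d):
    """Return ||f||_{U^d}^{2^d} for f given as a list of complex values on Z/NZ."""
    n = len(f)
    if d == 1:
        return abs(sum(f) / n) ** 2
    total = 0.0
    for h in range(n):
        # Multiplicative derivative f(x) * conj(f(x + h)).
        deriv = [f[x] * f[(x + h) % n].conjugate() for x in range(n)]
        total += gowers_power(deriv, d - 1)
    return total / n


def gowers_norm(f, d):
    """Return ||f||_{U^d}."""
    return gowers_power(f, d) ** (1.0 / 2 ** d)


N = 7
quadratic_phase = [e(n * n / N) for n in range(N)]

if __name__ == "__main__":
    print("U^2 norm:", gowers_norm(quadratic_phase, 2))
    print("U^3 norm:", gowers_norm(quadratic_phase, 3))
```

For this phase every second multiplicative derivative is the constant $e(2hk/N)$, so the $U^3$ norm equals 1, whereas $\|f\|_{U^2}^4 = 1/N$: the function is invisible to linear Fourier analysis but maximally non-uniform at the quadratic level.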

We now use Theorem 10.9 to obtain a finitary (and reasonably quantitative) version of
the Host-Kra theorem in the case $k = 4$, with very explicit nilsequences; in fact, they
will be none other than the elementary 2-step nilsequences defined earlier.

Theorem 12.8 (Inverse theorem for $U^3(\mathbb{Z}/N\mathbb{Z})$, elementary nilsequence version). Let
$N > 2$ be a prime, let $0 < \eta$ be sufficiently small, and suppose that $f : \mathbb{Z}/N\mathbb{Z} \to \mathbb{D}$ is
a function with $\|f\|_{U^3(\mathbb{Z}/N\mathbb{Z})} \geq \eta$. Then there exists an elementary 2-step nilsystem $G/\Gamma$
of dimension at most $\eta^{-C}$, an $\eta^{-C}$-Lipschitz function $F : G/\Gamma \to \mathbb{D}$, and elements $g \in G$,
$x_0 \in G/\Gamma$ and $h \in \mathbb{Z}/N\mathbb{Z}$ such that

$$|\mathbb{E}_{n \in \mathbb{Z}/N\mathbb{Z}}(T^h f \cdot \overline{F_{N,g,x_0}})| \geq \exp(-\eta^{-C}).$$

Here, we define $F_{N,g,x_0} : \mathbb{Z}/N\mathbb{Z} \to \mathbb{C}$ by

$$F_{N,g,x_0}(n) := F(T_g^n x_0) \quad \text{for all } -N/2 < n < N/2.$$

Remarks. The function $F_{N,g,x_0}$ is a 2-step nilsequence, adapted to $\mathbb{Z}/N\mathbb{Z}$. The
analogous theorem for $U^2(\mathbb{Z}/N\mathbb{Z})$ is a trivial consequence of Proposition 2.2; the only
linear nilfunction that needs to be considered is the function $F(x) := e(x)$ on the unit
circle $\mathbb{R}/\mathbb{Z}$ from Example 12.2, with $h = 0$ and $x_0 = 0$. A modification of Example
2.4 can be used to show that in formulating this theorem we must take into account
Example 12.4; the other two fundamental 2-step nilsystems are in fact embedded inside
this one and one could have dispensed with them altogether, but we have kept them for
expository purposes. One could also easily eliminate the role of $x_0$ (which is harmless
anyway, since it ranges over a compact set) and of the shift $h$, but the parameter $g$
ranges over a genuinely non-compact set and cannot be eliminated from this theorem
(this can be seen even in the linear case; the frequency $\xi$ in Proposition 2.2 is not
restricted to a bounded set of values independently of $N$).

Proof. Applying Theorem IIP.hi (and Tvemma l8.2jl . we obtain a set S C 1/N1 with 
d := [S'! ^ r]~'", a regular Bohr set B = B{S,p) with p ^ r;*", and a bracket quadratic 

G«'65 

with Freq(0) C S such that 

|E„eB(T'*/(I^)e(-0(r^)))| ^ 

and thus 

\En(.z/NziT^fin)e{-(j){n))lB{n))\ ( 12 . 8 ) 

Now let £ := p'^/iOOd, and let y : R/Z —> [0,1] be a continuous function such that 
x{x) = 1 when |a;| < p(l — e) and y(x) = 0 when 1/2 ^ |x| > p(l -|- e). Consider the 
function^® 

«6S 

It is supported on B{S,p{l + e)) \ B{S,p{l — e)), which by the regularity of B has 
cardinality no more than 200de\B\. Thus fll 2.8j) implies that 

|E„r'*/(n)e(-0(n)) ■ n)| ^ - 200(ie)ElB ^ ^ exp{-p~'^), 

«65 

the latter inequality being a consequence of Lemma 18.11 

^®The large power of $\chi$ here is so that we can distribute the cutoff $\chi$ among various factors later.
Expanding out $\phi$ as in (12.7), we see that our task is to show that the function

( n • ’^})) ( n (12.9) 

is an elementary 2-step nilsequence, for which it is enough to handle each of the functions
in the product separately as in the following lemma.

Lemma 12.9. Each of the individual functions

$$n \mapsto \chi(\xi \cdot n)e(a_\xi\{\xi \cdot n\}) \qquad (12.10)$$

and

$$n \mapsto \chi^2(\xi \cdot n)\chi^2(\xi' \cdot n)e(a_{\xi,\xi'}\{\xi \cdot n\}\{\xi' \cdot n\}) \qquad (12.11)$$

can be written as an elementary 2-step nilsequence with dimension no more than 9 and
Lipschitz constant at most 50.

Proof. We begin by considering the functions (12.10), which are easier (corresponding
to linear nilcharacters rather than quadratic ones). Split $a_\xi = q + s$, where $q$ is an
integer and $|s| \leq 1/2$, and observe that $e(q\{\xi \cdot n\}) = e(q\xi n/N)$ if we identify $\mathbb{Z}/N\mathbb{Z}$ and
its dual $\widehat{\mathbb{Z}/N\mathbb{Z}}$ with the integers from $-N/2$ to $N/2$. Thus the function (12.10) takes the form

$$n \mapsto \chi(\xi n/N)e(s\{\xi n/N\})e(q\xi n/N).$$

This function may be identified as the elementary nilsequence $F_{N,(\xi/N, q\xi/N),0}$, where the
underlying nilmanifold $G/\Gamma$ is the direct sum of two copies of the unit circle shift (i.e.
it is the torus $(\mathbb{R}/\mathbb{Z})^2$) and $F : (\mathbb{R}/\mathbb{Z})^2 \to \mathbb{C}$ is the function

$$F(x, y) := \chi(x)e(sx)e(y)$$

where we identify $x \in \mathbb{R}/\mathbb{Z}$ with a real number from $-1/2$ to $1/2$ in the usual manner.
It is not hard to check that $F$ is 50-Lipschitz.

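Now consider the functions (12.11). We split $a_{\xi,\xi'}$ as $q + s$, much as before, so that
(12.11) becomes

$$n \mapsto \chi(\alpha n)^2\chi(\gamma n)^2 e(s\{\alpha n\}\{\gamma n\})e(q\{\alpha n\}\{\gamma n\}),$$

where $\alpha := \xi/N$ and $\gamma := \xi'/N$. Observe that since $\{\alpha n\} = \alpha n - [\alpha n]$ and
$\{\gamma n\} = \gamma n - [\gamma n]$, we have the identity

$$q\{\alpha n\}\{\gamma n\} = q\alpha\gamma n^2 - q\alpha n[\gamma n] - q\gamma n[\alpha n] + q[\alpha n][\gamma n].$$

The last term is an integer, and hence

$$e(q\{\alpha n\}\{\gamma n\}) = e(q\alpha\gamma n^2)e(-q\alpha n[\gamma n])e(-q\gamma n[\alpha n]).$$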
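The expansion above is just the product $(\alpha n - [\alpha n])(\gamma n - [\gamma n])$ multiplied out, and it can be confirmed with exact rational arithmetic. The following Python sketch does so for sample values; the helper names and the parameters are hypothetical choices of ours, and $[x]$ denotes the nearest integer with half-integers rounded up, as in Example 12.4.

```python
from fractions import Fraction
from math import floor


def nearest_int(x):
    """Nearest integer [x], rounding half-integers up."""
    return floor(x + Fraction(1, 2))


def frac(x):
    """Signed fractional part {x} := x - [x], lying in [-1/2, 1/2)."""
    return x - nearest_int(x)


def identity_terms(q, alpha, gamma, n):
    """Return (lhs, rhs, integer_term) for the bracket expansion at the integer n."""
    a, g = alpha * n, gamma * n
    lhs = q * frac(a) * frac(g)
    integer_term = q * nearest_int(a) * nearest_int(g)
    rhs = q * a * g - q * a * nearest_int(g) - q * g * nearest_int(a) + integer_term
    return lhs, rhs, integer_term


if __name__ == "__main__":
    q, alpha, gamma = 3, Fraction(5, 17), Fraction(11, 17)
    for n in range(-3, 4):
        print(n, identity_terms(q, alpha, gamma, n))
```

Because `integer_term` is an integer whenever $q$ is, it contributes nothing after applying $e(\cdot)$, which is exactly the step used to pass to the factorization into three phases.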

Thus it suffices to exhibit the three functions

$$n \mapsto \chi(\alpha n)\chi(\gamma n)e(s\{\alpha n\}\{\gamma n\})$$

$$n \mapsto e(q\alpha\gamma n^2)$$

$$n \mapsto \chi(\gamma n)e(-q\alpha n[\gamma n]), \quad n \mapsto \chi(\alpha n)e(-q\gamma n[\alpha n])$$

as elementary 2-step nilsequences (note that the last two functions are essentially the
same).
The first function can be obtained from a direct sum of two copies of the unit circle shift
(Example 12.2) by repeating the analysis of (12.10), with $F$ now defined by
$F(x, y) := \chi(x)\chi(y)e(sxy)$ when $-1/2 < x, y \leq 1/2$.

The second function can easily be obtained from the skew shift (Example 12.3), by
writing

$$q\alpha\gamma n^2 = -q\alpha\gamma n + 2q\alpha\gamma \frac{n(n+1)}{2},$$

and then taking $F(x, y) := e(y)$, $x_0 = (0, 0)$, and

$$g := \begin{pmatrix} 1 & 1 & -q\alpha\gamma \\ 0 & 1 & 2q\alpha\gamma \\ 0 & 0 & 1 \end{pmatrix}.$$

Finally let us consider the third function. We write

$$e(-q\gamma n[\alpha n]) = e(\tfrac{1}{2}n(n+1)\alpha q\gamma - [n\alpha]nq\gamma)\, e(-\tfrac{1}{2}n(n+1)\alpha q\gamma).$$

The second factor can be generated using the skew shift as before^^. We are thus left
with

$$\chi(\alpha n)e\Big(\frac{n(n+1)}{2}\alpha q\gamma - [n\alpha]nq\gamma\Big).$$

But this can be generated from the Heisenberg shift (Example 12.4) with $x_0 = (0, 0, 0)$,
$F(x, y, z) := \chi(x)e(y)$ on $[-1/2, 1/2] \times (\mathbb{R}/\mathbb{Z})^2$, and

$$g := \begin{pmatrix} 1 & q\gamma & 0 \\ 0 & 1 & \alpha \\ 0 & 0 & 1 \end{pmatrix}.$$

It is easy to check that all of the functions $F$ used in these constructions are $2\pi$-Lipschitz,
a bound which together with Lemma 12.5 completes the proof of the lemma. □

Lemma 12.9, together with another application of Lemma 12.5, confirms that the function
(12.9) is an elementary 2-step nilfunction with dimension at most $18d^2$ and Lipschitz
constant no more than $100d^2$. This completes the proof of Theorem 12.8. □

Remark. A pleasant reformulation of Theorem 112.81 may be obtained by considering 
the (5-atoms of G/T. Suppose that F : G/T —>• (D is iL-Lipschitz. Let (A^)^gn be the 
5-atoms of G/T, pick arbitrary points x^ G for each cn, and write 

F(n) 

Since F is iL-Lipschitz we clearly have the bound ||E — F|| ^ K5. Taking 5 ^ 
exp(— 7 “*") we may replace the conclusion of Theorem 112.81 bv 

|E„ez/7Vz(T^/-Fv,s,xo)l ^ |exp(-? 7 “^). 

^'^Indeed we could simply have rewritten our factorization of $e(q\{\alpha n\}\{\gamma n\})$ to incorporate these
factors (and a linear phase correction), so as to then dispense with the second factor and the skew
shift altogether. However, we have left this example in here to emphasize that purely quadratic phase
functions such as $e(q\alpha\gamma n^2)$ are indeed examples of nilcharacters.
Removing the sum over the atoms (of which there are at most 5“^ ) by the pigeonhole 

principle this implies that 

\^n&'L/N'L{T^f{^Au:)N,g,xo)\ ^ exp( — 2 ?] 

That is, if $\|f\|_{U^3}$ is large then $f$ correlates with the set of return times of a 2-step
nilsequence to an atom.

Remark. The space of quadratic nilsequences forms an algebra, being closed under
multiplication, addition, subtraction, and conjugation. This allows one to employ an
“energy incrementation” argument of the type used in §7] in order to decompose an
arbitrary bounded function on $\mathbb{Z}/N\mathbb{Z}$ as the sum of a bounded function with $U^3$ norm
smaller than some specified $\eta$, plus a 2-step nilsequence with dimension and Lipschitz
constant controlled by functions of $\eta$. In the ergodic theory setting an extremely similar
decomposition was obtained in [21. Informally speaking, the quadratic nilsequences form
a characteristic factor for the $U^3$ norm, and hence for any expression controlled by that
norm, and thus many questions involving such expressions can be reduced to questions
concerning 2-step nilsequences.

It is perhaps of interest to briefly discuss a decomposition of this type for the $U^2$ norm,
where the availability of harmonic analysis allows one to proceed more directly. If
$f : \mathbb{Z}/N\mathbb{Z} \to \mathbb{D}$ is a function then we write $R := \{r : |\hat f(r)| \geq \varepsilon_1\}$ for some suitable $\varepsilon_1$,
and define $\beta(x)$, $\mathbb{E}\beta = 1$, to be a normalized and suitably smoothed version of $1_B(x)$,
where $B := B(R, \varepsilon_2)$. We then decompose

$$f(x) = f * \beta + (f - f * \beta). \qquad (12.12)$$

It is easy to check that $\|f - f * \beta\|_{U^2}$ is small, but to write $f * \beta$ as a 1-step nilsequence
it must be modified slightly. To do this, write

$$f * \beta(x) = \sum_{r \in \tilde R} \hat f(r)\hat\beta(r)e(-rx/N) + \sum_{r \notin \tilde R} \hat f(r)\hat\beta(r)e(-rx/N), \qquad (12.13)$$

where $\tilde R := \{\sum_i m_i r_i : r_i \in R, |m_i| \leq M\}$ for some $M$. Now if $\beta$ is sufficiently
smoothed and if $M$ is sufficiently large then

r^R 

and so the second term in (12.13) is bounded by $\varepsilon_3$, and in particular has small $U^2$
norm. The first term can be written as a 1-step nilfunction, the underlying nilmanifold
being $(\mathbb{R}/\mathbb{Z})^d$ and the rotation $T$ being $(x_1, \ldots, x_d) \mapsto (x_1 + r_1/N, \ldots, x_d + r_d/N)$. See
[22] for an application of such a decomposition (there the language of nilsystems and
ergodic theory did not feature, and the simpler decomposition (12.12) was used).

13. Future prospects and open questions 

It is natural to ask whether there are inverse theorems for the higher f/^(G)-norms, 
fc ^ 4, which generalize Theorems 12.71 llO.bl and 112.81 We are certain that the answer 
to this question is “yes”. It is easy to guess at the correct generalization of Theorem 
dl which should simply involve replacing 2 -step nilmanifolds by {k — 2 )-step ones. 
Guessing at the generalization of Theorem 12.71 is a bit harder. We suspect that the 
correct objects to consider for the 7/^(G)-norm are of the form lQ(x)e(—A(x)), where 
now Q := {x : '0i(a;),...,~ 1} is a quadratic Bohr set, each -0^ being of the 
form = lBj{x)e{—4>j{x)) appearing in Theorem 12.71 The phase A : Q — > R/Z is 

now cubic. It is easy to guess how functions appropriate for the U^{G) norm may be 
constructed inductively. 

We think it likely that most of the ingredients necessary to prove such inverse theorems 
may be found in ^7\, and we intend to pursue this direction. The same major difficulty 
that Gowers encountered in dealing with the $U^k(G)$ norm for $k \geq 4$ is also present here. 
Suppose that $f$ has large $U^4(G)$ norm. This means that $T^h f \overline{f}$ has large $U^3(G)$ norm 
for many values of $h$. Applying Theorem 2.7 we obtain a Bohr set $B_h$ for each of these 
$h$, such that $T^h f \overline{f}$ has large quadratic bias on several shifts of this Bohr set $B_h$. The 
problem is that we do not, a priori, have any control on how the Bohr set $B_h$ depends 
on $h$. 

Another interesting issue is that of obtaining better bounds in Theorems 2.3 and 
2.7. It is quite possible that the codimension of $W$ in Theorem 2.3 can be taken 
to be $O(\log(1/\eta))$ rather than $O(\eta^{-C})$. This would give bounds of the form 

$\|f\|_{u^3(\mathbb{F}_5^n)} \leq \|f\|_{U^3(\mathbb{F}_5^n)} \ll \|f\|_{u^3(\mathbb{F}_5^n)}^{c}$ (13.1) 

for some absolute constant c. Such a bound would be a consequence of the Polynomial 
Freiman-Ruzsa Conjecture (PFR), which is discussed in detail in jHI], together with 
some mild adjustments to the arguments of jlHl. We refer to the statement (13.1) as the 
Polynomial Gowers Inverse Conjecture (PGI) for $\mathbb{F}_5^n$. 
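
The easy direction of any comparison of this kind, namely that correlation with a quadratic phase is bounded by the $U^3$ norm, can be checked by brute force on a tiny group. The sketch below is an illustration only. It takes $G = \mathbb{Z}/5\mathbb{Z}$ (the case $n = 1$ of the finite field setting), computes the $U^3$ norm directly from the eight-fold product in its definition, and maximises $|\mathbb{E} f(x) e(-(ax^2+bx)/5)|$ over $a, b$. The names `u3_norm` and `quadratic_bias` are ours, and restricting to phases of the form $(ax^2+bx)/5$ is an assumption suited to this cyclic group.

```python
import itertools
import numpy as np

def u3_norm(f):
    """Brute-force Gowers U^3 norm on Z/NZ, straight from the eight-fold product."""
    N = len(f)
    total = 0j
    for x, a, b, c in itertools.product(range(N), repeat=4):
        total += (f[x]
                  * np.conj(f[(x + a) % N]) * np.conj(f[(x + b) % N]) * np.conj(f[(x + c) % N])
                  * f[(x + a + b) % N] * f[(x + b + c) % N] * f[(x + c + a) % N]
                  * np.conj(f[(x + a + b + c) % N]))
    # The sum is real and non-negative in exact arithmetic; clip rounding noise.
    return float(max(total.real / N ** 4, 0.0) ** 0.125)

def quadratic_bias(f):
    """max over a, b of |E_x f(x) e(-(a x^2 + b x)/N)|."""
    N = len(f)
    x = np.arange(N)
    return float(max(abs(np.mean(f * np.exp(-2j * np.pi * (a * x * x + b * x) / N)))
                     for a in range(N) for b in range(N)))

N = 5
rng = np.random.default_rng(1)
f_random = rng.uniform(-1, 1, N) + 1j * rng.uniform(-1, 1, N)
f_random /= np.max(np.abs(f_random))              # make the function bounded by 1
f_quadratic = np.exp(2j * np.pi * (np.arange(N) ** 2) / N)  # a pure quadratic phase

print(quadratic_bias(f_random), u3_norm(f_random))
print(quadratic_bias(f_quadratic), u3_norm(f_quadratic))
```

For the pure quadratic phase both quantities equal 1 up to rounding, while for a generic bounded function the bias sits below the $U^3$ norm. The hard direction, bounding the $U^3$ norm polynomially by the bias, is exactly what the conjecture asks for and is not something a small experiment can establish.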

It would be nice to have a version of (13.1) in a general $G$. What we mean by this is a 
statement of the form 

$\|f\|_{U^3(G)} \geq \eta \implies |\mathbb{E} f(x) 1_B(x) e(-\phi(x))| \geq c(\eta)$, 

$B = B(S, \rho)$, which may be reversed with only polynomial losses in the constants, that 
is to say 

$|\mathbb{E} f(x) 1_B(x) e(-\phi(x))| \geq c(\eta) \implies \|f\|_{U^3(G)} \gg \eta^{C}$. 

This would seem to require that we can take $|S| = O(\log(1/\eta))$ and $\rho$ greater than some 
absolute constant^®. The methods of this paper seem to fall a long way short of proving 
such a statement. Even if one had an appropriate analogue of PFR (which might take 
the form of a stronger version of Lemma fb.41 in which the size of S is logarithmic in S), 
we would have to find a way to avoid repeatedly passing to smaller Bohr sets as in 
Each such passage causes too much degradation in $\rho$. 

Let us conclude this section by remarking that working out how to drop the restriction 
$(|G|, 6) = 1$ in Theorem 2.7 would be a diverting exercise at least, though we cannot 
think of any applications. The case $G = \mathbb{F}_2^n$ probably captures the essence of the 
problem^®. 


^^even then one would need to replace $1_B$ with something smoother, to avoid the losses in the 
argument at the beginning of m 

^^The authors have recently learnt that Samorodnitsky m has resolved this issue. 

14. Appendix: algebraic constraints on nilmanifolds 

In this section^° we prove Lemma 12.7. For some further discussion of issues related to 
such constraints, see IHlEniEIlEaEl. 

By replacing g, x by g^, T^x respectively our task is to demonstrate, given any $(k-2)$-step 
nilmanifold $G/\Gamma$, the existence of a compact set $\Sigma \subseteq (G/\Gamma)^{k-1}$ and a continuous map 
$P : \Sigma \to G/\Gamma$ such that 

$(x, T_g x, \ldots, T_g^{k-2}x) \in \Sigma; \quad P(x, T_g x, \ldots, T_g^{k-2}x) = T_g^{k-1}x$ for all $g \in G$, $x \in G/\Gamma$. (14.1) 

Remark. One can prove (14.1) by direct algebraic computation in the cases $k = 3, 4$. 
Indeed, when $k = 3$ the 1-step nilpotent group $G$ is abelian, as is the subgroup $\Gamma$, 
so $G/\Gamma$ is also a group (indeed it is a torus). One can then take $\Sigma = (G/\Gamma)^2$ and 
$P(a,b) := a(a^{-1}b)^2$. When $k = 4$, so that $G$ is a 2-step nilpotent group, things are a 
little more complicated. One needs to take 

$\Sigma := \{(x_0\Gamma, x_1\Gamma, x_2\Gamma) : x_2 \in x_0(x_0^{-1}x_1)^2 G_2\}$; 

this reflects the fact that $G/G_2$ is a 1-step nilpotent group and thus obeys the $k = 3$ 
constraints. Note that $G_2$ commutes with all elements of $G$ and is thus easy to quotient 
out. One can then define $P : \Sigma \to G/\Gamma$ by setting 

$P(a\Gamma, b\Gamma, c\Gamma) := a(a^{-1}b)^3((a^{-1}b)^{-2}a^{-1}c)^3\Gamma$ whenever $(a\Gamma, b\Gamma, c\Gamma) \in \Sigma$; 

one can verify with some effort (using of course the fact that all commutators lie in $G_2$, 
which commutes with all elements of $G$) that this function is well-defined, continuous on 
$\Sigma$, and obeys (14.1). Unfortunately in the $k > 4$ case it seems that the function $P$ is 
significantly messier, and in particular requires choosing a partial inverse for projection 
maps $G_j \to G_j/\Gamma$ for all $j < k - 2$, which in general cannot be done canonically when 
$j > 2$. 
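
The two explicit formulas can be sanity-checked along an orbit. The sketch below is illustrative and makes several assumptions of our own: the torus is $\mathbb{R}/\mathbb{Z}$ written additively (so that $a(a^{-1}b)^2$ becomes $2b - a$), the 2-step group is the Heisenberg group with rational entries, and the formulas are evaluated on the group representatives $y, gy, g^2y$ themselves. It therefore checks only that the formulas reproduce the next point of the orbit, not that they are well defined on cosets of $\Gamma$.

```python
from fractions import Fraction

# Heisenberg group: (a, b, c) stands for the matrix [[1, a, c], [0, 1, b], [0, 0, 1]].
# Its centre G_2 consists of the elements (0, 0, c).
IDENTITY = (0, 0, 0)

def mul(p, q):
    return (p[0] + q[0], p[1] + q[1], p[2] + q[2] + p[0] * q[1])

def inv(p):
    a, b, c = p
    return (-a, -b, -c + a * b)

def power(p, n):
    if n < 0:
        return power(inv(p), -n)
    out = IDENTITY
    for _ in range(n):
        out = mul(out, p)
    return out

def P3(a, b):
    """k = 3 on the torus R/Z, written additively: a(a^{-1}b)^2 becomes 2b - a."""
    return (2 * b - a) % 1

def P4(a, b, c):
    """k = 4: returns (a x1^3 x2^3, x2) with x1 = a^{-1} b and x2 = x1^{-2} a^{-1} c."""
    x1 = mul(inv(a), b)
    x2 = mul(power(x1, -2), mul(inv(a), c))
    return mul(a, mul(power(x1, 3), power(x2, 3))), x2

# k = 3: the rotation x -> x + theta on R/Z.
x, theta = Fraction(1, 3), Fraction(2, 7)
torus_ok = P3(x, (x + theta) % 1) == (x + 2 * theta) % 1

# k = 4: the orbit y, gy, g^2 y, g^3 y in the Heisenberg group.
y = (Fraction(1, 3), Fraction(2, 5), Fraction(1, 7))
g = (Fraction(1, 2), Fraction(3, 4), Fraction(5, 6))
orbit = [mul(power(g, n), y) for n in range(4)]
predicted, x2 = P4(orbit[0], orbit[1], orbit[2])
print(torus_ok, predicted == orbit[3], x2)
```

On exact orbit representatives the correction factor `x2` is the identity, so the second factor in the $k = 4$ formula plays no role here. It matters only when the representatives of the three cosets are changed, which is where the "some effort" in the verification lies.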

To prove (14.1) in the general case, we first need some notation. 

Definition 14.1 (Continuous right invertibility). Let $M, N$ be compact spaces, let 
$\pi : M \to N$ be a continuous map, and let $\Sigma \subseteq M$. We say $\pi$ is continuously 
right-invertible on $\Sigma$ if for every $w \in \pi(\Sigma)$ there exists a neighbourhood $V_w \subseteq N$ of 
$w$ and a continuous map $\pi_w^{-1} : V_w \to M$ such that $\pi_w^{-1} \circ \pi$ is the identity on $\Sigma \cap \pi^{-1}(V_w)$. 

Lemma 14.2. Let $G/\Gamma$ be a $(k-2)$-step nilmanifold, and let $\pi : (x_0, \ldots, x_{k-1}) \mapsto 
(x_0, \ldots, x_{k-2})$ be the canonical projection from $(G/\Gamma)^k$ to $(G/\Gamma)^{k-1}$. Then $\pi$ is 
continuously right-invertible on the set 

$A := \{(x, T_g x, \ldots, T_g^{k-2}x, T_g^{k-1}x) : x \in G/\Gamma, g \in G\}$. 

The existence of a constraint (14.1) then follows by taking $\Sigma$ to be the closure of $\pi(A)$ 
and using a compactness argument (exploiting the fact that $\pi(A)$ is dense in the compact 
set $\Sigma$) to glue the various local right-inverses $\pi_w^{-1}$ together. 


^^The authors are indebted to Sasha Liebman and Tamar Ziegler for conversations which were very 
helpful in preparing this appendix. 











Proof. For any $0 \leq i \leq k - 1$, we define the Hall-Petresco groups $HP_{k,i} \subseteq G^k$ to be 
the sets 

$HP_{k,i} := \{(x_i^{\binom{n}{i}} \cdots x_{k-2}^{\binom{n}{k-2}})_{0 \leq n \leq k-1} : x_i \in G_i, \ldots, x_{k-2} \in G_{k-2}\}$, 

where we have the conventions that $G_0 := G_1 = G$, and that $\binom{n}{j} = 0$ if $j > n$. Thus for 
instance when $k = 4$ we have 

$HP_{4,3} = \{(1, 1, 1, 1)\}$ 

$HP_{4,2} = \{(1, 1, x_2, x_2^3) : x_2 \in G_2\}$ 

$HP_{4,1} = \{(1, x_1, x_1^2 x_2, x_1^3 x_2^3) : x_1 \in G_1, x_2 \in G_2\}$ 

$HP_{4,0} = \{(x_0, x_0 x_1, x_0 x_1^2 x_2, x_0 x_1^3 x_2^3) : x_0 \in G_0, x_1 \in G_1, x_2 \in G_2\}$. 

It is well known (see ^3]) that the $HP_{k,i}$ are all subgroups of $G^k$, and we also have the 
nesting $HP_{k,i+1} \subseteq HP_{k,i}$ for all $0 \leq i < k - 1$. 
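
The subgroup property is not obvious from the definition, since the entries do not commute. As a hedged illustration (not a proof), the sketch below checks closure of $HP_{4,0}$ under pointwise multiplication and inversion for two sample elements. The choice of the integer Heisenberg group as the 2-step group $G$, and all function names, are our own assumptions.

```python
# Heisenberg group over the integers: (a, b, c) stands for the matrix
# [[1, a, c], [0, 1, b], [0, 0, 1]]; the centre G_2 consists of the (0, 0, c).
IDENTITY = (0, 0, 0)

def mul(p, q):
    return (p[0] + q[0], p[1] + q[1], p[2] + q[2] + p[0] * q[1])

def inv(p):
    a, b, c = p
    return (-a, -b, -c + a * b)

def power(p, n):
    if n < 0:
        return power(inv(p), -n)
    out = IDENTITY
    for _ in range(n):
        out = mul(out, p)
    return out

def hp40(x0, x1, x2):
    """The tuple (x0, x0 x1, x0 x1^2 x2, x0 x1^3 x2^3); x2 is assumed central."""
    return tuple(mul(x0, mul(power(x1, n), power(x2, n * (n - 1) // 2)))
                 for n in range(4))

def in_hp40(t):
    """Membership test: read x0, x1, x2 off the first three entries, then check
    that x2 is central and that the fourth entry is the one these force."""
    x0 = t[0]
    x1 = mul(inv(x0), t[1])
    x2 = mul(power(x1, -2), mul(inv(x0), t[2]))
    return x2[0] == 0 and x2[1] == 0 and tuple(t) == hp40(x0, x1, x2)

def pointwise(s, t):
    return tuple(mul(p, q) for p, q in zip(s, t))

s = hp40((1, 2, 3), (4, -1, 0), (0, 0, 5))
t = hp40((-2, 1, 7), (3, 3, -4), (0, 0, -1))
product_in_group = in_hp40(pointwise(s, t))
inverse_in_group = in_hp40(tuple(inv(p) for p in s))
stray_tuple_in_group = in_hp40((IDENTITY, IDENTITY, IDENTITY, (1, 0, 0)))
print(product_in_group, inverse_in_group, stray_tuple_in_group)
```

The last tuple shows that membership is a genuine constraint: once the first three entries are fixed, the fourth is determined, which is the algebraic fact behind the constraint (14.1).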

Observe that $\Gamma^k$ is a subgroup of $G^k$, so we may form the quotient space $G^k/\Gamma^k$, which we 
identify with the compact manifold $(G/\Gamma)^k$. Inside this space we have the submanifolds 
$HP_{k,i}/\Gamma^k$ for all $0 \leq i \leq k - 1$. Observe that if $x \in G/\Gamma$ and $g \in G$, and $y \in x\Gamma \subseteq G$ is 
any representative of $x$ in $G$, then 

$(x, T_g x, \ldots, T_g^{k-1}x) = (y, gy, \ldots, g^{k-1}y)\Gamma^k$ 

$= (1, g, \ldots, g^{k-1})(y, \ldots, y)\Gamma^k \subseteq HP_{k,1}HP_{k,0}\Gamma^k$ 

$\subseteq HP_{k,0}\Gamma^k$ 

and hence 

$(x, T_g x, \ldots, T_g^{k-1}x) \in HP_{k,0}/\Gamma^k$. 

It thus suffices to show that $\pi$ is right-invertible on $HP_{k,0}/\Gamma^k$. 

We shall show inductively, by backwards induction on $i$, that $\pi$ is continuously 
right-invertible on $HP_{k,i}/\Gamma^k$ for all $0 \leq i \leq k - 1$. The case $i = k - 1$ is trivial since $HP_{k,k-1}/\Gamma^k$ 
is just a point. Now suppose inductively that $0 \leq i < k - 1$ and that $\pi$ was already 
shown to be continuously right-invertible over $HP_{k,i+1}/\Gamma^k$. 

Let $\pi(z) = (z_0, \ldots, z_{k-2}) \in \pi(HP_{k,i}/\Gamma^k)$, where $z \in HP_{k,i}/\Gamma^k$. Observe that the first $i$ coefficients of $z$ must be the 
origin $O \in G/\Gamma$, defined as the image of the identity $1 \in G$, and the coefficient $z_i$ lies 
in the closed manifold $G_i/\Gamma$. 

The projection map $\pi_i : G_i \to G_i/\Gamma$ is continuous and surjective from the manifold $G_i$ 
to the manifold $G_i/\Gamma$, which is a sub-manifold of $G/\Gamma$. Thus we may find a continuous 
function $f : V_{z_i} \to G_i$ defined on a neighbourhood $V_{z_i} \subseteq G/\Gamma$ of $z_i$ such that $\pi_i \circ f$ is 
the identity on $V_{z_i} \cap (G_i/\Gamma)$. 

We need to right-invert $\pi$ on $HP_{k,i}/\Gamma^k$ in a neighbourhood of $\pi(z)$. To this end, let 
$x := (x_0, \ldots, x_{k-1}) \in HP_{k,i}/\Gamma^k$ be such that $\pi(x)$ is close to $\pi(z)$; in particular we may 
take $x_i \in V_{z_i}$. As before we have $x_n = O$ for $n < i$, and $x_n \in G_i/\Gamma$ for all $n \geq i$. Thus 
$x_i \in V_{z_i} \cap (G_i/\Gamma)$ and hence $\pi_i \circ f(x_i) = x_i$, or in other words $x_i = f(x_i)\Gamma$. Now let 
$F(x_i) \in HP_{k,i}$ be the group element 

$F(x_i) := ((f(x_i)^{-1})^{\binom{n}{i}})_{0 \leq n \leq k-1}$, 

and observe that this depends continuously on $x_i$, and hence on $\pi(x)$, if $\pi(x)$ lies in a 
neighbourhood of $\pi(z)$. 

On the other hand, since $x \in HP_{k,i}/\Gamma^k$ there exists $g = (1, \ldots, 1, g_i, \ldots, g_{k-1}) \in HP_{k,i}$ 
such that $g\Gamma^k = x$; in particular, $g_i \in G_i$ and $g_i\Gamma = x_i$. Since $f(x_i) \in G_i$ and $f(x_i)\Gamma = x_i$, 
we conclude that $f(x_i)^{-1}g_i$ lies in both $G_i$ and in $\Gamma$. Thus if we let $\tilde{g} \in HP_{k,i}$ be the 
group element 

$\tilde{g} := ((f(x_i)^{-1}g_i)^{\binom{n}{i}})_{0 \leq n \leq k-1}$ 

then $\tilde{g}$ lies in both $HP_{k,i}$ and $\Gamma^k$. Thus we can factorize 

$g = F(x_i)^{-1} h \tilde{g}$ 

where $h$ lies in $HP_{k,i}$, and also has $i$-th component equal to the identity. Thus $h$ in fact 
lies in $HP_{k,i+1}$. Multiplying on the right by $\Gamma^k$, we conclude that 

$x = g\Gamma^k = F(x_i)^{-1}h\tilde{g}\Gamma^k = F(x_i)^{-1}h\Gamma^k$ 

and hence 

$F(x_i)x = h\Gamma^k \in HP_{k,i+1}/\Gamma^k$. 

Since $\pi(x)$ is close to $\pi(z)$, and $F(x_i)$ depends continuously on $\pi(x)$, we see that 
$\pi(F(x_i)x)$ is close to $\pi(F(z_i)z)$. In particular, by the induction hypothesis we can 
find a continuous map $\pi^{-1}_{F(z_i)z}$, mapping a neighbourhood $V_{F(z_i)z} \subseteq (G/\Gamma)^{k-1}$ of $\pi(F(z_i)z)$ 
to $HP_{k,i+1}/\Gamma^k$, which is a local right-inverse of $\pi$ on $HP_{k,i+1}/\Gamma^k \cap \pi^{-1}(V_{F(z_i)z})$. Thus we 
have 

$F(x_i)x = h\Gamma^k = \pi^{-1}_{F(z_i)z}(\pi(F(x_i)x))$ 

and hence 

$x = F(x_i)^{-1}\pi^{-1}_{F(z_i)z}(\pi(F(x_i)x))$. 

Observe that $\pi(F(x_i)x) = \pi(F(x_i))\pi(x)$, where $\pi : G^k \to G^{k-1}$ is the canonical 
projection. Since $x_i$ of course depends continuously on $\pi(x)$, the right-hand side then depends 
continuously on $\pi(x)$ when $\pi(x)$ lies in a sufficiently small neighbourhood of $\pi(z)$. We 
have achieved a right-inverse for $\pi$ on $HP_{k,i}/\Gamma^k$ in a neighbourhood of $\pi(z)$, thus closing 
the induction. □ 